Two Passing Thoughts: Tags and Spam - 哈斯日志
Friday, April 15, 2005
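On Tags

Gmail's label-based approach to sorting mail first brought tag-based classification to wide attention. Services abroad such as the Del.icio.us bookmarks, the furl web clipper, and the flickr photo albums then joined in, and folksonomy spread quickly. Recently furl and flickr have each been acquired; Del.icio.us has won the favor of investors, and its owner plans to work on it full time; Yahoo! uses tag-based classification in 360; 365key has launched tag-based marking and search; and today Ask Jeeves announced that the history and bookmark features of MyJeeves support saving with tags. All signs suggest that folksonomy is surging once again.

Yet have blogs, wikis, and tag classification (folksonomy), all of which bill themselves as "grassroots" applications, really reached the grassroots? Who actually uses these supposedly grassroots applications? Just a handful of bloggers, few enough to count.

Of course, I have never denied that tagging is handy: it makes information truly personal. But it is handy only because it is convenient, public, and clusters similar items, which is what lets it become a social application. It is not easy to understand in itself, especially in practice; even fewer people realize that information with the same meaning can end up marked with different tags. Tag classification still has a long way to go before it becomes systematic and standardized!

Anti-Spam in Progress

The Search Engine Watch blog posted a paper by Stanford students, "A Taxonomy of Web Spam", which studies this taxonomy. It describes the many kinds of junk pages that build links to themselves and use SEO techniques, or outright cheating, to gain rankings in search engines; yet when a user searches for a related query and clicks the link, the page does not provide the information the user needs.

Bookmarks, web clippings, blogs, and Wikipedia are all currently, directly or indirectly, hard-hit areas for spam. Search engines support the rel=nofollow attribute on links, but it has had little effect. In this paper, Zoltan Gyongyi and Hector Garcia-Molina try to find a solution. They believe algorithms can achieve the following three goals:

1. Build a seed set of spam and use content recognition or structural analysis so that programs can identify spam; once a page is identified, indexing or crawling stops automatically, and spam pages may even be removed from the index by hand.
2. Have spiders identify spam automatically and stop crawling it.
3. Give high-quality content extra weight to offset the influence of spam on search results.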
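The three goals above can be illustrated with a toy sketch. This is not the algorithm from the paper; it is only one simple way such goals might be wired together, and every name in it (`flag_spam`, `crawl`, `rank`, the 0.5 threshold, the 2.0 boost, and the sample link graph) is a hypothetical chosen for illustration. The assumption here is that a page counts as spam when at least half of its outgoing links point to pages already flagged as spam.

```python
from collections import deque


def flag_spam(links, seed_spam, threshold=0.5, max_rounds=10):
    """Goal 1 (structural analysis): grow a spam set from a seed set.

    links maps each page to the list of pages it links to. A page is
    flagged when the share of its out-links pointing at flagged pages
    reaches the (assumed) threshold.
    """
    spam = set(seed_spam)
    for _ in range(max_rounds):
        added = False
        for page, outs in links.items():
            if page in spam or not outs:
                continue
            share = sum(1 for target in outs if target in spam) / len(outs)
            if share >= threshold:
                spam.add(page)
                added = True
        if not added:
            break
    return spam


def crawl(start, links, spam):
    """Goal 2: breadth-first crawl that does not fetch or expand flagged pages."""
    seen, queue, visited = {start}, deque([start]), []
    while queue:
        page = queue.popleft()
        if page in spam:
            continue  # the spider stops here: no indexing, no link following
        visited.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


def rank(scores, trusted, boost=2.0):
    """Goal 3: order pages by score, with extra weight for trusted pages."""
    def weighted(page):
        return scores[page] * (boost if page in trusted else 1.0)
    return sorted(scores, key=weighted, reverse=True)


# Hypothetical link graph: "farm1" and "farm2" exist mainly to link to "spam0".
links = {
    "hub": ["blog", "wiki", "farm1"],
    "blog": ["wiki"],
    "wiki": [],
    "farm1": ["farm2", "spam0"],
    "farm2": ["spam0"],
    "spam0": [],
}
spam = flag_spam(links, seed_spam={"spam0"})
crawled = crawl("hub", links, spam)
ordered = rank({"hub": 0.3, "blog": 0.2, "wiki": 0.25}, trusted={"wiki"})
```

In this toy graph the two link-farm pages are flagged along with the seed, the crawl from "hub" never visits them, and the trusted "wiki" page moves to the top of the ranking. A real system would need far more careful signals than a single link-share threshold.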
Written on April 15, 2005 by loverty.