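American National Corpus: Corpus Introduction. Shanghai Jiao Tong University "985 Project" School of Foreign Languages, Second Language Acquisition Platform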

American National Corpus

2008-06-01 11:16


http://www.americannationalcorpus.org/

The American National Corpus (ANC) project is creating a massive electronic collection of American English, including texts of all genres and transcripts of spoken data produced from 1990 onward. The ANC will provide the most comprehensive picture of American English ever created, and will serve as a resource for education, linguistic and lexicographic research, and technology development.

When completed, the ANC will contain a core corpus of at least 100 million words, comparable across genres to the British National Corpus (BNC). The corpus will also include an "opportunistic" component of potentially several hundred million words, chosen to provide the broadest and largest possible selection of texts (and, where available, annotations).

The ANC has so far released 22 million words of American English, which are available from the Linguistic Data Consortium (LDC); please consult the LDC Catalog entry.

The ANC is working with annotation projects that are generating layers of annotation for some or all of the following: Penn Treebank-style syntactic annotations, PropBank, NomBank, TimeML, and opinion annotations. The data and annotations from these projects will be added to the ANC.


 