THESIS: NATIONAL CORPORA AND THEIR SIGNIFICANCE IN LINGUISTICS

Authors

  • Karimboyeva Madinabonu
  • Abdullajonova Hakima

Keywords:

national corpus, corpus linguistics, linguistic data, language analysis, annotation, lexicography, sociolinguistics, language teaching, computational linguistics, natural language processing, language policy, empirical research, parallel corpora, spoken corpus, digital linguistics

Abstract

This article examines the concept of national corpora and their growing importance in contemporary linguistics. A national corpus is defined as a large, structured, and electronically stored collection of authentic texts representing the language of a specific nation or speech community. The paper discusses the theoretical foundations of corpus linguistics, the structural components of national corpora, and notable examples from different countries. Special attention is given to the significance of corpora in linguistic research, lexicography, language teaching, sociolinguistics, translation studies, and computational linguistics. The article also highlights the challenges involved in corpus construction, such as data collection, annotation, and copyright issues. Finally, it outlines future directions for corpus development in light of technological advancements. The study demonstrates that national corpora serve as essential tools for understanding language use, guiding language policy, and fostering the development of modern linguistic technologies.

References

1. Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge University Press.

2. Gries, S. T. (2009). Statistics for linguistics with R: A practical introduction. De Gruyter Mouton.

3. Hunston, S. (2002). Corpora in applied linguistics. Cambridge University Press.

4. Kennedy, G. (1998). An introduction to corpus linguistics. Routledge.

5. Leech, G. (1991). The state of the art in corpus linguistics. In K. Aijmer & B. Altenberg (Eds.), English corpus linguistics (pp. 8–29). Longman.

6. McEnery, T., & Hardie, A. (2012). Corpus linguistics: Method, theory and practice. Cambridge University Press.

7. McEnery, T., & Wilson, A. (2001). Corpus linguistics: An introduction (2nd ed.). Edinburgh University Press.

8. Meyer, C. F. (2002). English corpus linguistics: An introduction. Cambridge University Press.

9. O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to classroom: Language use and language teaching. Cambridge University Press.

10. Sinclair, J. (1991). Corpus, concordance, collocation. Oxford University Press.

11. Sinclair, J. (2004). Trust the text: Language, corpus and discourse. Routledge.

12. Tognini-Bonelli, E. (2001). Corpus linguistics at work. John Benjamins.

13. Vintar, Š., & Fišer, D. (2011). Compilation, annotation and application of a national corpus: The Slovenian case. International Journal of Lexicography, 24(2), 119–134. https://doi.org/10.1093/ijl/ecq040

14. Xudoyberganova, M. (2020). Development of the Uzbek National Corpus: Problems and prospects. Uzbek Journal of Applied Linguistics, 5(1), 45–53.

Published

2025-12-06

How to Cite

[1]
Karimboyeva Madinabonu and Abdullajonova Hakima. 2025. THESIS: NATIONAL CORPORA AND THEIR SIGNIFICANCE IN LINGUISTICS. Ustozlar uchun. 85, 2 (Dec. 2025), 187–194.