A COMPARISON OF THE BAG OF WORDS AND WORD2VEC MODELS
Keywords: Bag of Words, Word2Vec, NLP, vectorization, semantics, context, CBOW, Skip-gram, TF-IDF, neural networks.
Abstract
This article gives a detailed analysis of two principal approaches to converting
text into vector form in natural language processing: the Bag of Words and Word2Vec
models. Bag of Words is a simple statistical approach based on word frequencies; it
does not capture the semantic meaning of a text and, because the vector length grows
with the vocabulary size, it suffers from a dimensionality problem. Word2Vec, by
contrast, is based on neural networks: it learns the contextual and semantic
relationships between words and produces dense, meaningful vectors. The article
covers the working mechanism of both models, their advantages and limitations, their
practical applications, and their place in modern NLP systems. It also shows why
criteria such as task complexity, resource requirements, and semantic depth matter
when choosing between these models.
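
The two Bag of Words limitations named above (no semantic signal, and a vector length tied to vocabulary size) can be illustrated with a minimal sketch. The helper names `build_vocabulary`, `bow_vector`, and `cosine`, the whitespace tokenization, and the three toy documents are assumptions made for illustration only; they are not taken from the article.

```python
from collections import Counter
import math


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (sorted for determinism)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bow_vector(document, vocabulary):
    """Return the raw term counts of one document, one entry per vocabulary word."""
    counts = Counter(document.lower().split())
    # Dicts keep insertion order, so entries follow the sorted vocabulary order.
    return [counts.get(tok, 0) for tok in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus: the first two documents are near-synonymous.
docs = ["great film", "excellent movie", "great great film"]
vocab = build_vocabulary(docs)
vectors = [bow_vector(d, vocab) for d in docs]

print(len(vocab))                       # vector length equals vocabulary size
print(cosine(vectors[0], vectors[1]))   # synonyms share no columns
print(cosine(vectors[0], vectors[2]))   # shared surface words give high similarity
```

In this sketch the near-synonymous pair scores zero because the two documents have no words in common, while the pair that repeats the same surface words scores high. A Word2Vec-style model is meant to close exactly this gap by placing words that occur in similar contexts near each other in a dense space of fixed, much smaller dimension.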