Analysis of keywords of scientific production of researchers: the author as an indexer

Authors

Mariângela Spotti Lopes Fujita

Roberta Cristina Dal'Evedove Tartarotti

DOI:

https://doi.org/10.5433/1981-8920.2020v25n3p332

Keywords:

Keyword, Indexing, Scientific production, Control of vocabulary, Controlled vocabulary, Lattes Curriculum

Abstract

Introduction: With the advent of the digital and web era, the keyword has become an essential representation product in open access information storage and retrieval systems. It is extracted and used for the scientific identification of authors, bibliometric analysis, indicators of scientific impact, and the development of controlled vocabularies and other knowledge organization systems. When authors assign keywords to their scientific publications, they represent the content of the work while filling in metadata. This assignment relies on a personal subject analysis and on the author's specialized vocabulary, generally without any guidance on standardization or vocabulary control. Moreover, the subject metadata field that receives the keywords does not undergo professional validation.
Objective: To analyze the keywords that researchers assign to articles in journals indexed in Scopus and to the corresponding records in the Portal Docentes Unesp, with regard to standardization and vocabulary control for the different functions of information storage and retrieval systems.
Methodology: Exploratory research was carried out through an observational study and analysis of the keywords assigned by researchers listed in the Portal Docentes Unesp, using data from the CNPq Lattes Curriculum, compared with the keywords assigned to the corresponding journal articles.
Results: The results show a lack of standardization, at both the syntactic and semantic levels, in the keywords that researchers register in the Lattes Curriculum for their journal articles. In the assessment of indexing consistency, low levels are observed when these keywords are compared with those of the original articles, on both the rigid index and the relaxed (flexible) index.
Conclusions: A policy for organizing and representing information is needed to give researchers guidelines on assigning keywords, aiming at greater standardization and consistency in both the representation and the retrieval of their scientific production.
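
Illustrative note: the abstract reports consistency on a rigid and a relaxed index but does not give the formulas. The sketch below is one possible reading, not the authors' procedure. It assumes a Hooper-style ratio (terms in common divided by the terms used in either set). The rigid variant counts only exact matches after normalization. The relaxed variant also counts terms that share at least one significant word. The function names, the stopword list, and the sample keywords are hypothetical.

```python
import unicodedata

# Assumption: a tiny stopword list so that function words alone
# never produce a "relaxed" match.
STOPWORDS = {"of", "the", "and", "in", "for"}


def normalize(term: str) -> str:
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", term)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def significant_words(term: str) -> set:
    """Words of a normalized term, without stopwords."""
    return set(term.split()) - STOPWORDS


def partially_matches(term: str, others: set) -> bool:
    """True if the term equals, or shares a significant word with, any other term."""
    words = significant_words(term)
    return any(term == other or words & significant_words(other) for other in others)


def consistency(keywords_a, keywords_b, relaxed=False) -> float:
    """Hooper-style consistency: common / (|A| + |B| - common)."""
    set_a = {normalize(t) for t in keywords_a}
    set_b = {normalize(t) for t in keywords_b}
    if not set_a and not set_b:
        return 0.0
    if relaxed:
        # Count matches from both sides and keep the smaller count,
        # so the ratio cannot exceed 1.
        from_a = sum(1 for t in set_a if partially_matches(t, set_b))
        from_b = sum(1 for t in set_b if partially_matches(t, set_a))
        common = min(from_a, from_b)
    else:
        common = len(set_a & set_b)
    return common / (len(set_a) + len(set_b) - common)


# Hypothetical keyword sets for one article.
lattes_keywords = ["Indexing", "Controlled vocabulary", "Keywords"]
article_keywords = ["indexing", "Vocabulary control", "Scientific production"]

rigid_score = consistency(lattes_keywords, article_keywords)
relaxed_score = consistency(lattes_keywords, article_keywords, relaxed=True)
print(rigid_score, relaxed_score)
```

In this made-up example only "indexing" matches exactly, while "controlled vocabulary" and "vocabulary control" match only under the relaxed rule, so the relaxed score is higher than the rigid one.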

Author Biographies

Mariângela Spotti Lopes Fujita, Universidade Estadual Paulista - UNESP

PhD in Communication Sciences from the Universidade de São Paulo - USP

Roberta Cristina Dal'Evedove Tartarotti, Universidade Estadual Paulista - UNESP

PhD in Information Science from the Universidade Estadual Paulista - UNESP

Published

2020-10-31

How to Cite

Fujita, M. S. L., & Tartarotti, R. C. D. (2020). Analysis of keywords of scientific production of researchers: the author as an indexer. Informação & Informação, 25(3), 332–374. https://doi.org/10.5433/1981-8920.2020v25n3p332

Issue

Vol. 25 No. 3 (2020)

Section

Articles