An authorship attribution model applied to pedophilia crime investigations
DOI:
https://doi.org/10.5433/1981-8920.2022v27n1p381
Keywords:
Child and adolescent sexual abuse on the internet, Authorship attribution, Stylometry, Pedophilia, Police investigation
Abstract
Objectives: To identify the current state of the art of scientific research on authorship attribution applied to investigations of sexual crimes against children and adolescents committed over the Internet and involving written material, and to propose a methodology for using authorship attribution to identify suspected authors of texts that encourage child and adolescent sexual abuse.
Methodology: This is a qualitative study that uses a systematic literature review to identify works dealing with authorship attribution techniques, in order to find scientific evidence of their application to problems similar to the one addressed here.
Results: The study presents the current state of the art of scientific research relating authorship attribution techniques to internet texts that encourage the sexual abuse of children and adolescents and, on that basis, proposes a methodology for identifying the authors of such texts.
Conclusions: Scientific research on this topic is scarce, which suggests that the field is open to further study. The proposed methodology also shows that authorship attribution techniques can feasibly be applied to identify the probable authors of texts intended to guide and encourage child and adolescent sexual abuse.
License
The journal reserves the right to make normative, spelling, and grammatical changes to the originals in order to maintain standard language usage and the credibility of the publication. It will, however, respect the authors' writing style. Conceptual changes, corrections, or suggestions will be forwarded to the authors when necessary.
The content of the texts and the citation and use of submitted images are the sole responsibility of the authors.
All subsequent citations must credit the original source of publication, in this case Informação & Informação.