Information Extraction from CV Documents Using a Hybrid Natural Language Processing and Rule-Based Approach
Ekstraksi Informasi dari Dokumen CV Menggunakan Pendekatan Hybrid Natural Language Processing dan Rule-Based
DOI: https://doi.org/10.21070/ups.9859
Keywords: Natural Language Processing, Applicant Tracking System, Information Extraction, Resume Parsing, Hybrid Approach, Rule-Based System
Abstract
Digital transformation in recruitment mandates efficient Applicant Tracking Systems (ATS), yet variable, unstructured CV formats remain a primary obstacle. Traditional rule-based methods often lack flexibility, while Deep Learning models incur high computational costs. This study presents a hybrid CV information extraction system designed to balance accuracy and efficiency.
The methodology integrates rule-based algorithms (regular expressions) for factual data with Natural Language Processing (NLP), specifically the spaCy library, for analyzing competencies and experience. Empirical results show a Precision of 91.3%, a Recall of 87.5%, and an F1-Score of 89.4%. The system attained an overall accuracy of 80.8% with a processing time of under 2 seconds per document, suggesting that hybrid methods can manage CV complexity without sacrificing computational speed.
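As a rough illustration of the hybrid idea described in the abstract, the following minimal Python sketch pairs a regular-expression stage for factual fields with a second stage for competencies. It is not the authors' implementation: the patterns, the `SKILLS` vocabulary, and all function names are hypothetical, and the second stage is a plain keyword lookup standing in for the spaCy-based analysis, which the abstract does not detail. The small `f1_score` helper only checks that the reported F1-Score is the harmonic mean of the reported Precision and Recall.

```python
import re

# Hypothetical patterns; the abstract does not publish the actual expressions.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d ().-]{7,}\d")

# Hypothetical skill vocabulary; a stand-in for the NLP (spaCy) stage.
SKILLS = {"python", "sql", "machine learning", "project management"}


def extract_factual(text):
    """Rule-based stage: pull contact details with regular expressions."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


def extract_skills(text):
    """Stand-in for the NLP stage: whole-phrase lookup against SKILLS."""
    lowered = text.lower()
    return sorted(
        skill for skill in SKILLS
        if re.search(r"\b" + re.escape(skill) + r"\b", lowered)
    )


def parse_cv(text):
    """Combine both stages into a single record."""
    record = extract_factual(text)
    record["skills"] = extract_skills(text)
    return record


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sample = (
        "Jane Doe\n"
        "Email: jane.doe@example.com\n"
        "Phone: +62 812-3456-7890\n"
        "Skills: Python, SQL, machine learning"
    )
    print(parse_cv(sample))
    print(round(f1_score(0.913, 0.875), 3))
```

Splitting the work this way keeps the cheap, deterministic patterns for fields with a fixed shape and reserves the heavier language model for free-text sections, which is the trade-off the abstract attributes to the hybrid design.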
License
Copyright (c) 2026 UMSIDA Preprints Server

This work is licensed under a Creative Commons Attribution 4.0 International License.
