Preprint has been published in a journal as an article
Preprint / Version 1

Information Extraction from CV Documents Using a Hybrid Natural Language Processing and Rule-Based Approach

Ekstraksi Informasi dari Dokumen CV Menggunakan Pendekatan Hybrid Natural Language Processing dan Rule-Based

DOI:

https://doi.org/10.21070/ups.9859

Keywords:

Natural Language Processing, Applicant Tracking System, Information Extraction, Resume Parsing, Hybrid Approach, Rule-Based System

Abstract

Digital transformation in recruitment demands efficient Applicant Tracking Systems (ATS), yet the variable, unstructured formats of CVs remain a primary obstacle. Traditional rule-based methods often lack flexibility, while deep learning models incur high computational costs. This study presents a hybrid CV information extraction system designed to balance accuracy and efficiency.

The methodology combines rule-based algorithms (regular expressions) for factual data with Natural Language Processing (NLP), specifically the spaCy library, for analyzing competencies and experience. In the empirical evaluation, the system achieved a Precision of 91.3%, a Recall of 87.5%, and an F1-Score of 89.4%, with an overall accuracy of 80.8% and a processing time of under 2 seconds per document. These results indicate that a hybrid method can handle the complexity of CVs without sacrificing computational speed.
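
To make the division of labor in the abstract concrete, here is a minimal Python sketch of a hybrid extractor. It is not the authors' implementation. The choice of email and phone number as the "factual" fields, the two regular expressions, the `SKILLS` vocabulary, and the function names `extract_facts`, `extract_skills`, and `parse_cv` are all assumptions made for illustration. The NLP stage assumes spaCy v3 and uses a blank English pipeline with `PhraseMatcher`, so no trained model is needed. If spaCy is not installed, the sketch falls back to plain word-boundary matching.

```python
import re

try:
    import spacy
    from spacy.matcher import PhraseMatcher
except ImportError:  # spaCy is optional in this sketch
    spacy = None

# Rule-based stage: regular expressions for factual fields.
# Both patterns are illustrative assumptions, not taken from the paper.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d ().-]{7,}\d")

# Hypothetical skill vocabulary, for illustration only.
SKILLS = ["python", "sql", "machine learning", "project management"]


def extract_facts(text):
    """Pull the first email address and phone number found by the regexes."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


def extract_skills(text):
    """Find known skill phrases, case-insensitively, and return them sorted."""
    if spacy is None:
        # Fallback without spaCy: word-boundary search on lowercased text.
        lowered = text.lower()
        return sorted(
            s for s in SKILLS
            if re.search(r"\b" + re.escape(s) + r"\b", lowered)
        )
    nlp = spacy.blank("en")  # tokenizer only, no trained model
    matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
    matcher.add("SKILL", [nlp.make_doc(s) for s in SKILLS])
    doc = nlp(text)
    return sorted({doc[start:end].text.lower() for _, start, end in matcher(doc)})


def parse_cv(text):
    """Combine the rule-based and NLP stages into a single record."""
    result = extract_facts(text)
    result["skills"] = extract_skills(text)
    return result


if __name__ == "__main__":
    sample = (
        "Jane Doe\n"
        "Email: jane.doe@example.com\n"
        "Phone: +62 812-3456-7890\n"
        "Skills: Python, SQL, Machine Learning"
    )
    print(parse_cv(sample))
```

Keeping the two stages in separate functions reflects the trade-off the abstract describes: the regular expressions are cheap and predictable for rigidly formatted fields, while the tokenizer-based matcher tolerates variation in casing and layout for free-text sections.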

Posted

2026-01-27