Resume parser in Java

I want to parse a resume to extract its section titles and their content, which includes bullets, paragraphs, and URLs. The resume is in .doc/.docx format. My research so far suggests:

1. Build an XML file from the .doc file, and then
2. Build an XML parser using JDOM.

Is there another or better way to do this? Is there an algorithm that would help identify the structure of a resume?

Solution

It looks like you are heading in the right direction. A simple approach: once you have identified a piece of information, use it as an anchor and traverse a few steps before and after it (+/- steps), using the calculated spacing, to identify the related results.
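
As a rough illustration of finding those anchors, here is a minimal sketch that classifies plain-text lines and groups them under the nearest preceding heading. It assumes the text has already been extracted from the .doc/.docx file into a list of lines. The class name, the heading vocabulary, and the output labels are all hypothetical, and the regex heuristics are only one possible starting point, not a complete parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

public class ResumeSectionSketch {
    // Assumed heading vocabulary; extend it for the resumes you actually see.
    private static final Pattern HEADING = Pattern.compile(
        "(?i)^(summary|objective|experience|work experience|education|skills|projects)\\s*:?$");
    // A line that starts with a bullet character or a number such as "1." or "2)".
    private static final Pattern BULLET = Pattern.compile("^\\s*([-*•]|\\d+[.)])\\s+.*");
    // Any http/https URL appearing in the line.
    private static final Pattern URL = Pattern.compile("https?://\\S+");

    public enum Kind { HEADING, BULLET, URL, PARAGRAPH, BLANK }

    /** Classifies a single line of extracted resume text. */
    public static Kind classify(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) return Kind.BLANK;
        if (HEADING.matcher(trimmed).matches()) return Kind.HEADING;
        if (BULLET.matcher(line).matches()) return Kind.BULLET;
        if (URL.matcher(trimmed).find()) return Kind.URL;
        return Kind.PARAGRAPH;
    }

    /** Groups non-blank lines under the most recent heading, in document order. */
    public static Map<String, List<String>> groupBySection(List<String> lines) {
        Map<String, List<String>> sections = new LinkedHashMap<>();
        String current = "HEADER"; // holds text that appears before the first heading
        sections.put(current, new ArrayList<>());
        for (String line : lines) {
            Kind kind = classify(line);
            if (kind == Kind.BLANK) continue;
            if (kind == Kind.HEADING) {
                // Normalise "Skills:" and "skills" to the same key.
                current = line.trim().replaceAll("\\s*:$", "").toUpperCase(Locale.ROOT);
                sections.putIfAbsent(current, new ArrayList<>());
            } else {
                sections.get(current).add(kind + ": " + line.trim());
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Jane Doe", "https://example.com/jane", "",
            "Skills:", "- Java", "- XML",
            "Education", "BSc in Computer Science");
        groupBySection(lines).forEach((title, content) ->
            System.out.println(title + " -> " + content));
    }
}
```

Once the headings are located this way, the lines between two consecutive headings are the content of a section, and you can refine the rules (or replace them with an NLP model) as you see more samples.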

I assume you are using NLP methods, which can help you find related data by proximity; you can then remove the noise based on your experience.

Or simply use an existing product. I recommend RChilli CV Parsing, or others such as HireAbility or Sovren; discuss your needs with them. I am sure you will get some useful information.

thanks -K
