Re-factoring is just a big word covering all the things related to making changes in an existing project.
The first thing to determine is if it's worthwhile to do some re-factoring at all, instead of rewriting everything from scratch.
If you do not need any backward compatibility, then sure, you can probably rewrite, and it's probably a good idea.
If you need to migrate an existing set of data, or if the project you have to work on is already in use, then you cannot really afford to start from scratch. At the very minimum you need to be compatible with previous file formats, and you probably need to give users some of the things they were used to having in the previous project. To do that you need a method, and you need to write documentation too (unless the previous project was entirely documented, but I doubt that is the case, or you would not be tempted to start from scratch, would you?).
First, there is a high probability that a large part of the project you need to re-factor is not in use anymore. Dead code, half finished functions, remnants from a previous attempt at re-factoring, you name it.
You need to put a fairly strict method in place to handle that, or you will most probably drive straight into the wall. Re-factoring is not for the weak or the impatient. It's a bit like forensic analysis: you need good tools and some intuition, all backed by good method and programming practices. And probably also some political skills, to explain to management why it's important to spend some time now on a few things to avoid problems later.
What you will need:
people in the know who worked on the system before, to get you started.
You will find some suggestions for these tools on the recommended tools page.
Personally I think that file formats are the first thing to analyze, document, and possibly change to something better 1).
The reason is simple: most applications are data driven in one way or another. So if you have the old system still running, you probably have loads of data available, in various formats, in various places. This data can be loaded from disk, from the network, … it does not really matter. The important fact is that this data needs to be loaded and/or saved. This is done by some code, which handles the possibly numerous versions of the file format. This is a very good place to start documenting the system and finding out what is used or not.
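To make the idea concrete, here is a minimal sketch of a survey script that counts which format versions actually occur in the existing data. It assumes a purely hypothetical header layout (a 4-byte magic tag followed by a little-endian 16-bit version number); the real layout has to come from the loading code you are documenting. The names `read_version` and `survey_versions` are illustrative only.

```python
import struct
from collections import Counter
from pathlib import Path

# Hypothetical header layout (an assumption, not taken from any real format):
# a 4-byte magic tag followed by a little-endian unsigned 16-bit version.
HEADER = struct.Struct("<4sH")


def read_version(path):
    """Return (magic, version) for a file, or None if it is too short."""
    with open(path, "rb") as handle:
        raw = handle.read(HEADER.size)
    if len(raw) < HEADER.size:
        return None
    magic, version = HEADER.unpack(raw)
    return magic, version


def survey_versions(root):
    """Count how many files under root carry each (magic, version) pair."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[read_version(path)] += 1
    return counts
```

A version that the loading code supports but that never shows up in the counts is a good candidate for the "probably not in use anymore" list, to be confirmed before anything is deleted.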
Sometimes you will find that you cannot locate any data for a particular set of loading/saving/streaming code. It means either that the code was never used, or that it is just a remnant of the past. You can generally figure out which by checking in the source control system 2) how long this code has been present in the codebase.
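As an illustration, and assuming the project happens to be stored in Git (the text does not name a particular source control system), the age of a source file can be estimated from the date of the commit that first added it. This is only a file-level approximation of "how long this code has been present", and the function names below are hypothetical.

```python
import subprocess
from datetime import datetime, timezone


def first_commit_command(path):
    """Build the Git command that lists the date(s) at which path was added."""
    # %at prints the author date as a UNIX timestamp, one line per commit.
    return ["git", "log", "--follow", "--diff-filter=A",
            "--format=%at", "--", path]


def age_in_days(epoch_seconds, now=None):
    """Days elapsed since a UNIX timestamp given as a string or an int."""
    then = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (now - then).days


def code_age_in_days(path, repo="."):
    """Age of the oldest commit that added path, or None if none is found."""
    result = subprocess.run(first_commit_command(path), cwd=repo,
                            capture_output=True, text=True, check=True)
    dates = result.stdout.split()
    # git log prints newest first, so the last entry is the oldest addition.
    return age_in_days(dates[-1]) if dates else None
```

Code that has sat untouched for years with no matching data is more likely a remnant, while code added recently with no data may simply be unfinished work.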
If you can locate data, you need to check the time stamps (if available) to figure out whether this data is just legacy material lying around, or whether it is still in use.
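A rough way to do this check on files stored on disk is to split them by last modification time. The sketch below assumes that the file system time stamps are trustworthy, which is not always true after a copy or a restore from backup; the one-year threshold and the name `split_by_age` are arbitrary choices for the example.

```python
import time
from pathlib import Path


def split_by_age(root, max_age_days=365, now=None):
    """Split files under root into (recent, stale) lists by modification time."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    recent, stale = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # st_mtime is the last modification time in seconds since the epoch.
        bucket = recent if path.stat().st_mtime >= cutoff else stale
        bucket.append(path)
    return recent, stale
```

Files in the stale list are not proof that a format is dead, only a hint about where to ask the people who know the system.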