A few years ago, migrating content to a new site wasn’t a huge deal. Most publishers had a limited amount of content on their sites, and it was primarily text. Today, that same site may house thousands of files, including video, images and slideshows.

Many publishers have existing staff or temps manually migrate stories into the new CMS. Others turn to automated migration systems such as Vamosa and Indigen Vector, but those can be pricey, depending on how much content is involved (prices range wildly, but think $30,000 for a large job). Either way, it can be a complicated, frustrating task. “There’s not a whole lot that’s positive about migrations until they’re done,” says developer Randy Funke.

Here are five rules to keep in mind when migrating content to a new CMS.

1. Understand What Type of Content You Have
The first step is to take inventory of what kinds of files you have (text, HTML, PDF, etc.) as well as their related metadata.

“The big question on every development team’s mind is what is the preferred format of the exported file?” says Funke. “File preference is the primary factor that determines the nature of the migration. It could be in XML, or SQL or it could be raw HTML (which is a complete nightmare). The way things get mapped depends completely on the file type. You’re a slave to that file.”
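For a rough idea of what that inventory step can look like, here is a minimal sketch that counts files by extension. The function name and the sample paths are made up for illustration; a real inventory would walk the old site's actual file tree and would also need to capture metadata.

```python
from collections import Counter
from pathlib import PurePosixPath


def inventory_by_type(paths):
    """Count files per lowercase extension; files with no extension go under '(none)'."""
    counts = Counter()
    for path in paths:
        ext = PurePosixPath(path).suffix.lower() or "(none)"
        counts[ext] += 1
    return counts


# Hypothetical sample of paths from an old site.
sample = ["news/a.html", "news/b.HTML", "docs/report.pdf", "img/photo.jpg", "README"]
for ext, count in sorted(inventory_by_type(sample).items()):
    print(ext, count)
```

Lowercasing the extension keeps ".HTML" and ".html" in the same bucket, which matters on older sites where naming was inconsistent.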

2. Map It Out
Think of the CMS as a giant Excel spreadsheet that divvies up content by type as well as title field, author field, date, category, topic, subtopic, tags, metadata, keywords, body copy and related articles. The trick to migrating is creating relationships among the tables and databases.

Field mapping is the key to how your CMS accesses data. “When you have content tagged to different areas of your site, it becomes infinitely complicated very quickly,” says Funke. “Each CMS does their sequence ID a completely different way.”
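As an illustration of field mapping, the sketch below renames fields from an old export to the names a new CMS might expect and reports anything left unmapped. Every field name here is hypothetical; the real names depend entirely on the two systems involved.

```python
# Hypothetical mapping: old export field name -> new CMS field name.
FIELD_MAP = {
    "headline": "title",
    "byline": "author",
    "pub_date": "date",
    "section": "category",
    "story_text": "body",
}


def map_record(old_record, field_map=FIELD_MAP):
    """Return (mapped record, sorted list of old fields that have no mapping)."""
    mapped = {
        new_name: old_record[old_name]
        for old_name, new_name in field_map.items()
        if old_name in old_record
    }
    unmapped = sorted(set(old_record) - set(field_map))
    return mapped, unmapped


record = {"headline": "Budget Passes", "byline": "J. Smith", "legacy_id": 3658}
mapped, unmapped = map_record(record)
print(mapped)
print("Needs a decision:", unmapped)
```

Surfacing the unmapped fields, rather than silently dropping them, gives the team a list of decisions to make before the real import runs.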

3. Clean Up the Content
Make sure the code is up to date and compatible with the new system, particularly the character sets. Moving to a new CMS may require a lot of content cleanup, both to optimize for search and to make the content work correctly in the new system. “The previous class of content management systems frequently handled content by storing it in database tables,” says Joe Bachana, president of Web firm DPCI. “In that case you could do straight content migration. The challenge is, with sites where HTML was created, you need to create scripts to run conversions to the new system. Publishers especially have to be worried about SEO. People also have to make sure the quality of content is there so there are no weird styling issues that make the site look like a pasted-together ransom note.”
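Character-set problems are one of the more mechanical parts of cleanup. The sketch below assumes the new system wants UTF-8 and that older files may be in Windows-1252, a common legacy encoding. Both assumptions need to be checked against the actual content, and the function name is made up for illustration.

```python
def to_utf8_text(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode bytes as UTF-8, falling back to an assumed legacy encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes the fallback encoding cannot represent become U+FFFD,
        # so nothing is silently lost and problem spots stay visible.
        return raw.decode(fallback, errors="replace")


legacy_bytes = "“Smart quotes” and café".encode("cp1252")
print(to_utf8_text(legacy_bytes))
```

Trying UTF-8 first is a deliberate choice: text that is already valid UTF-8 passes through untouched, and only the files that fail get the legacy treatment.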

Make sure you re-link your media or import it into the same kind of directory structure. “A lot of people have bookmarks to an old site—this is the icing on the cake of why migrations are hell—and after the data gets migrated, the lovely tasks of access rewrite rules come into play,” says Funke. “Old bookmarks now have to redirect to new ones. It’s not like the ID numbers in the CMS will be in sync with each other. On the new one, an article might be #10,000, while on the old one it was article #3,658. If you can’t construct rules that will allow the old pages to find the articles on the new CMS, you have to manually import those articles.”
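The ID mismatch Funke describes is usually handled with a lookup table recorded during import. The sketch below is one way that might look; the URL patterns ("/article/3658" on the old site, "/stories/10000" on the new one) are invented for illustration, and the actual redirect would be issued by the web server or CMS.

```python
import re

# Hypothetical old-ID -> new-ID table, recorded as each article is imported.
ID_MAP = {3658: 10000}

# Assumed shape of an old article URL.
OLD_URL = re.compile(r"^/article/(\d+)$")


def redirect_target(old_path, id_map=ID_MAP):
    """Return the new path for an old article URL, or None if there is no match."""
    match = OLD_URL.match(old_path)
    if not match:
        return None
    new_id = id_map.get(int(match.group(1)))
    if new_id is None:
        return None
    return f"/stories/{new_id}"


print(redirect_target("/article/3658"))
```

Returning None for unknown IDs makes the gaps easy to log, which produces the list of articles that still need manual attention.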

4. Migrate the Best Content—Dump the Rest
Let old content go unless it’s premium content. “Take a look at evergreen content for migration,” says Bachana. “If you have a high degree of complexity with migration and cleanup, look only at the best quality content. See what content is highly trafficked and maybe throw out the stuff on your long tail. If people looked at migration as a process to get the best quality materials out, then they wouldn’t have to dig deep into that other set of assets.”
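One simple way to act on that advice is to filter the archive by traffic and an evergreen flag before migrating anything. In this sketch the record fields ("views", "evergreen") and the 1,000-view threshold are placeholders; the right cutoff depends on the site's own analytics.

```python
def select_for_migration(articles, min_views=1000):
    """Keep IDs of articles flagged evergreen or with at least min_views page views."""
    return [
        article["id"]
        for article in articles
        if article.get("evergreen") or article.get("views", 0) >= min_views
    ]


# Hypothetical archive records.
archive = [
    {"id": 1, "views": 25000},
    {"id": 2, "views": 40, "evergreen": True},
    {"id": 3, "views": 12},
]
print(select_for_migration(archive))
```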

5. Choose a System You Can Walk Away From
It may sound strange but part of choosing a new CMS is selecting one you can walk away from fairly easily in the future. “Migration has a lot to do with where you’re coming from as opposed to where you’re going,” says Bachana. “Think about how it handles content today in case you have to migrate away tomorrow. Understand what structure that content is in. If it’s in XML, you’ll be in pretty good shape.”