In this article you will learn how to split large .tsv files (and other large text files) so you can import them into MS Office, OpenOffice or whatever tool you need, using a small and neat (paid) program called EmEditor.

Here is the scenario: I have been a Booking.com affiliate partner for many years. A few years ago I dreamt of having programmatic access to all of their hotels (with metadata), so I could import that data into another web app and reuse it in whatever manner I wanted.

Recently I noticed a new option in my Booking Central: Data download. Here it was, everything I had dreamt of a couple of years earlier. Booking.com offered to let me download their hotel data in zipped .tsv format. Unzipped, my file was about 1 GB. I tried to import it into both OpenOffice (LibreOffice) and MS Excel, but both programs stopped responding and couldn't handle the file. For a moment I felt desperate: I had the data but couldn't do much with it.

Booking.com hotel data sets

Then I remembered that a few years ago I was building a geospatial web application using data from geonames.org, and back then I used some tools to split large files into smaller ones so I could import them into Excel, adjust them as needed and then import the data into Drupal using Feeds import.

Unfortunately I couldn't remember exactly which tools I had used back then, so I started searching on Google and quickly found a tool called EmEditor.


EmEditor is a fast, lightweight, yet extensible, easy-to-use text editor for Windows. Both native 64-bit and 32-bit builds are available!
 

EmEditor is not free (it costs about $40), but it does offer a fully functional 30-day free trial. So I downloaded EmEditor, opened my almost 1 GB .tsv file in it, and got stuck for a moment: I couldn't find the function for splitting the file. After a couple of minutes of searching I finally found it under Tools -> Split/Combine.

EmEditor File Splitter

The Split Current Document to Several Files command allows you to split the current document into several files either every user-specified number of lines, or before every bookmarked line. It also allows you to specify a header and/or footer to each separated file. 
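If you don't have EmEditor at hand, the same "split every N lines and repeat the header" idea can be sketched in a few lines of Python. This is only a rough equivalent under my own assumptions, not how EmEditor does it internally: the function name `split_by_lines`, its parameters and the `_partNNN` file naming are all made up for illustration, and it assumes every record sits on exactly one line (no line breaks inside fields).

```python
from pathlib import Path


def split_by_lines(src, out_dir, lines_per_file, has_header=True):
    """Split src into numbered parts with at most lines_per_file data lines each.

    Hypothetical helper for illustration. When has_header is True, the first
    line of src is copied to the top of every part.
    """
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")

    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    parts = []   # paths of the files written so far
    out = None   # currently open output file
    count = 0    # data lines written to the current part

    # Binary mode streams the file line by line, so the whole file never has
    # to fit in memory, and the bytes are copied through without re-encoding.
    with src.open("rb") as infile:
        header = infile.readline() if has_header else b""
        try:
            for line in infile:
                # Start a new part on the first line or when the current one is full.
                if out is None or count == lines_per_file:
                    if out is not None:
                        out.close()
                    name = f"{src.stem}_part{len(parts) + 1:03d}{src.suffix}"
                    part_path = out_dir / name
                    out = part_path.open("wb")
                    out.write(header)
                    parts.append(part_path)
                    count = 0
                out.write(line)
                count += 1
        finally:
            if out is not None:
                out.close()
    return parts
```

For example, `split_by_lines("hotels.tsv", "parts", 1_000_000)` (with `hotels.tsv` standing in for whatever your download is called) would write `parts/hotels_part001.tsv`, `parts/hotels_part002.tsv` and so on. One million lines per file is a reasonable choice here because an Excel worksheet holds at most 1,048,576 rows.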

It actually took very little time (less than 2 minutes) to split my 1 GB file into six smaller files. Once the split was done, I opened the resulting files in MS Excel, adjusted the filters and was ready to continue developing my application for importing the data into Drupal.

Try EmEditor.

Hope this helps!