I wanted to import the Million Song Dataset into SQL Server on Linux. There’s a GitHub repo with the SQL you need to use the dataset with the graph database features. However, it’s built for Windows.
Linux is a slightly different beast. Once I started down this path, I had memories of working on SunOS in college, messing with permissions and moving files.
I run Ubuntu in VMware, so I first downloaded the files to my Documents folder. That’s pretty easy. However, once there, the mssql user can’t read them. Rather than mess with permissions on my home directory, I decided to move the files to a location where the mssql user could read them.
First, I needed mv to move the files. However, the default location for SQL Server (/var/opt/mssql) isn’t writable by my user, so a plain mv fails. Instead, I had to run the mv with sudo.
sudo mv unique_tracks.txt /var/opt/mssql/unique_tracks.txt
I repeated this for each file.
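If there are a lot of files, the repeated mv can be wrapped in a loop. This is only a sketch of how I might do it: the function name move_dataset_files, the SUDO variable, and the assumption that every dataset file ends in .txt are mine, not something from the repo.

```shell
# Move every .txt file from a source folder into a destination folder.
# SUDO is a hypothetical knob: if it is unset, sudo is used, because
# /var/opt/mssql is not writable by a normal user. Set SUDO="" to skip it.
move_dataset_files() {
    src_dir="$1"
    dest_dir="$2"
    for f in "$src_dir"/*.txt; do
        [ -e "$f" ] || continue   # nothing matched the glob, so skip
        ${SUDO-sudo} mv "$f" "$dest_dir/"
    done
}

# Example call (paths are assumptions based on my setup):
# move_dataset_files ~/Documents /var/opt/mssql
```

Leaving SUDO unset gives the same behavior as typing sudo mv for each file by hand.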
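However, I still had permission errors. Files have their own owner and permissions in Linux, and moving a file doesn’t change them, so the files still belonged to my user. I decided to use chown to hand them to the mssql user, since these are temp files that SQL Server will use and, once they’re imported, I’ll delete them. Changing a file’s owner to another user requires root, so this needs sudo as well, along with the full path to the file.
sudo chown mssql /var/opt/mssql/unique_tracks.txt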
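Before trying the import again, it can help to confirm that the ownership change actually took. Here is a rough sketch of a check, assuming GNU stat (which is what Ubuntu ships). The function name check_owner and the SUDO variable are hypothetical names of my own.

```shell
# Report whether each file is owned by the expected user.
# SUDO is a hypothetical knob: if it is unset, sudo is used, in case my
# user cannot look inside the target folder. Set SUDO="" to skip it.
check_owner() {
    expected="$1"
    shift
    for f in "$@"; do
        # GNU stat: -c '%U' prints the user name of the file's owner
        actual=$(${SUDO-sudo} stat -c '%U' "$f")
        if [ "$actual" = "$expected" ]; then
            echo "OK $f"
        else
            echo "MISMATCH $f (owner: $actual)"
        fi
    done
}

# Example call (path taken from the steps above):
# check_owner mssql /var/opt/mssql/unique_tracks.txt
```

Any file reported as MISMATCH still needs the chown.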
From here, I could easily run the OPENROWSET commands and get the data loaded. Now to play around with a graph.