Changes to Data License, new export options

by teo

Big changes regarding the usage of Discogs data!

1) The Discogs Data License for Artist, Label, and Release data has been changed to a Public Domain license, as noted in the [url=/help/api]API documentation[/url]. This means that there are no restrictions on your usage of the data.

2) The “Export to Excel” option has been removed and replaced with a more versatile [url=/users/export]export function[/url]. This new export page allows you to download your Collection, Wantlist, or Contributions in CSV or XML format. The XML format will include complete release details. Exports of your Collection will also contain the Folder field, which is a much asked-for feature. The new export page is also linked in My Discogs.

33 comments about “Changes to Data License, new export options”
  • y-1 10 years ago
    Very nice teo, thanks.

    I noticed just one mistake when downloading in CSV format: if a record has more than one cat# (usually when it's on more than one label), these cat#s should be enclosed in quotes so that they don't end up in two separate columns, just as is already done for several labels on one release.

    Example

    LASH 21, 886 880-7,Faith No More,Epic,"Slash Records, London Records","7"", Single",4,1990,1174663,http://www.discogs.com/release/1174663 ,Alternative Rock / Metal

    Corrected example

    "LASH 21, 886 880-7",Faith No More,Epic,"Slash Records, London Records","7"", Single",4,1990,1174663,http://www.discogs.com/release/1174663 ,Alternative Rock / Metal

    Cheers
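
The quoting rule discussed above can be sketched with Python's standard `csv` module. This is only an illustration of how a CSV writer usually handles fields that contain commas or quotes, not the code Discogs uses; the field values are taken from the example row, and the shortened column list is an assumption.

```python
import csv
import io

# Values from the example row above; the (shortened) column order is assumed.
row = ["LASH 21, 886 880-7", "Faith No More", "Epic",
       "Slash Records, London Records", '7", Single', "4", "1990"]

buffer = io.StringIO()
# QUOTE_MINIMAL (the default) quotes only fields that contain the delimiter,
# the quote character, or a line break, and doubles any embedded quotes.
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerow(row)

line = buffer.getvalue()
print(line, end="")
```

With this rule the two catalog numbers stay in one quoted field, so a reader that honours the quotes sees a single column for them.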
  • Julz72 10 years ago
    I wanted to download my collection in csv and xml, but both zips contained a csv file....
  • arT2 10 years ago
    many many thanks teo for this update!
  • little_alien 10 years ago
    Good news :)
  • 303 10 years ago
    Well, can anyone recommend a viewer that shows something more than lines of code?

  • 74 10 years ago
    How do you convert a csv into a normal excel sheet
    i mean like it was before?
  • md 10 years ago
    [quote=74]How do you convert a csv into a normal excel sheet[/quote]
    Open it in excel and save as an xls file.


    It's useful to have access to a download of contributions, and folders in the collection, but apart from the annoyance of having to go through several extra steps and waiting longer to get the same information, there are numerous bugs in the downloaded data:

    Insync vs Mysteron²* shown as Insync vs Mysteron²*
    (K-RAA-K)³ shown as (K-RAA-K)³
    L'écurie shown as L'écurie
    H&M shown as H
    A&M Records (UK) shown as A Records (UK)
    Ouvertüre »Coriolan« shown as Ouvertüre »Coriolan«
    Sähkö Recordings showing as Sähkö Recordings

    etc
    etc
    etc
  • 74 10 years ago

    [quote=md]Open it in excel and save as an xls file. [/quote]

    doesn't work

    i want it as before
    artist
    label
    title

    in a separate field
  • nik 10 years ago
    [quote=Julz72]I wanted to download my collection in csv and xml, but both zips contained a csv file....[/quote]

    This works ok for me, can you double check you downloaded the correct files please?

    [quote=303]Can anyone recommend a viewer that shows something more than lines of code? [/quote]
    [quote=74]How do you convert a csv into a normal excel sheet [/quote]

    Here's what I do:

    * I use OpenOffice instead of Microsoft Office. I prefer OpenOffice: it works on many different operating systems, can read and save many different file types, and it's free! Get it at http://www.openoffice.org/

    * Download the csv version of your Collection, Wantlist, or Contributions.

    * Save the zip to your desktop. Extract the file, then right click and open it with OpenOffice - it will give you a preview of what you will see. You need to set the character set to Unicode (UTF-8), and it should automatically be separated by comma.

    * Press 'OK' and it will open up. You can then view it.

    * If you want to save it, use 'Save As', and you can select from many different formats, including several versions of Microsoft Excel.

    I think the process for MS Excel for opening the CSV file and saving as an .xls file will be similar.
  • nik 10 years ago
    [quote=md]there are numerous bugs in the downloaded data:

    Insync vs Mysteron²* shown as Insync vs Mysteron²*
    (K-RAA-K)³ shown as (K-RAA-K)³
    L'écurie shown as L'écurie
    H&M shown as H
    A&M Records (UK) shown as A Records (UK)
    Ouvertüre »Coriolan« shown as Ouvertüre »Coriolan«
    Sähkö Recordings showing as Sähkö Recordings

    etc [/quote]

    I can't see those exact releases ATM md, but did you open the csv file as Unicode (UTF-8)?

    4th & Broadway displays OK for me doing this.
  • taalem 10 years ago
    i have the exact same problem as md.

    [quote=nik]I can't see those exact releases ATM md, but did you open the csv file as Unicode (UTF-8)? [/quote]
    excel won't ask you if the csv file is utf-8 or not.
    see http://members.chello.at/robert.graf/CSV/

    this should be fixed asap.
  • md 10 years ago
    I've never seen any options on opening a file as Unicode. I just double clicked on the file and it opened.

    I expect 4th & Broadway displays OK as there's a space on either side of the "&". The omission of characters seems to happen when they are immediately adjacent to the "&" in the artist, title or whatever.
  • nik 10 years ago
    Ok, I have tested again, and I see a problem with [r=914176] - label "A&M Records (UK)" only comes out as "A Records (UK)", so that is a bug I think:

    [quote=md]H&M shown as H
    A&M Records (UK) shown as A Records (UK) [/quote]

    These others look like the Unicode issue:

    [quote=md]Insync vs Mysteron²* shown as Insync vs Mysteron²*
    (K-RAA-K)³ shown as (K-RAA-K)³
    L'écurie shown as L'écurie
    Ouvertüre »Coriolan« shown as Ouvertüre »Coriolan«
    Sähkö Recordings showing as Sähkö Recordings [/quote]

    I am not sure what can be done, Discogs data is Unicode now. I don't want to be blasé and say MS Office sucks, but perhaps that is the reason here? taalem's link would point to Excel 2003 (what version are you guys using?) having this issue, and it appears to be a problem with the MS Office software, not the csv file. Whatever the cause of the issue, we'll need to try to find a solution.
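
The "Unicode issue" described here can be reproduced in a few lines of Python. This is a sketch under one assumption: that the spreadsheet program reads the UTF-8 bytes as Windows-1252 (`cp1252`), which is a common default on Western European Windows systems but is not confirmed anywhere in this thread.

```python
# A label name from the thread, as the export would store it.
original = "Sähkö Recordings"
utf8_bytes = original.encode("utf-8")

# Assumption: the reader treats the bytes as Windows-1252 instead of UTF-8,
# so every two-byte character turns into two unrelated characters.
misread = utf8_bytes.decode("cp1252")
print(misread)

# Undoing the wrong decoding recovers the text, which suggests the file
# itself is intact and only the interpretation went wrong.
repaired = misread.encode("cp1252").decode("utf-8")
print(repaired)
```

This only explains the accented-character problem; the dropped characters next to "&" are a separate matter.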
  • nik 10 years ago
    Here's more about the Excel Unicode issue: http://base0.net/archives/197-Pet-Peeve-2-Microsoft-Excel-and-Unicode.html

    Apparently, if you rename the file to .txt and then do an import in Excel, you can select Unicode at some point in the procedure. Not ideal, but it sounds like it'll work. You can then save it as an xls or whatever document.
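
A different workaround that is often suggested for this kind of problem is to write the CSV with a UTF-8 byte order mark (BOM), which many Excel versions are reported to use as a hint for the encoding. The sketch below is not the import procedure described above and not necessarily anything Discogs does; the header names and rows are hypothetical.

```python
import codecs
import csv
import io

# Hypothetical header and data rows, for illustration only.
rows = [["Label", "Artist"], ["Sähkö Recordings", "µ-Ziq"]]

text = io.StringIO()
csv.writer(text, lineterminator="\n").writerows(rows)

# The "utf-8-sig" codec prepends the three-byte BOM to the encoded output.
payload = text.getvalue().encode("utf-8-sig")
print(payload[:3] == codecs.BOM_UTF8)

# Reading with the same codec strips the BOM again.
round_trip = payload.decode("utf-8-sig")
```

Whether a given Excel version honours the BOM would need to be checked; older releases may still require the manual import.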
  • y-1 10 years ago
    I also got some of md's problems, like µ-Ziq displayed as µ-Ziq or Sinéad O'Connor as Sinéad O'Connor.
    No problems with R & S Records and the likes however.

    [quote=74]How do you convert a csv into a normal excel sheet
    i mean like it was before?[/quote]

    I had to experiment a little bit, but it turned out to be easy. The only problem is I got a German language Excel... maybe somebody can translate this for the English version. I'll try to give an idea in brackets.
    So here's what I do :)

    - select first column
    - execute 'Text in Spalten...' (Text to Columns...) from the drop-down menu 'Daten' (Data)
    - Step 1: select 'getrennt' (Delimited)
    - Step 2: select 'Trennzeichen: Komma' (Delimiters: Comma) and 'Texterkennungszeichen: " ' (Text qualifier: ")
    - Step 3: select 'Datenformat der Spalten: Standard' (Column data format: General)
    - finish

    There you are.
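
The steps above (comma as separator, double quote as text qualifier) correspond to what a CSV parser does. A minimal sketch with Python's `csv` module, using a shortened version of the example row from earlier in the thread; the shortening is only for readability.

```python
import csv

# Shortened example row, once with the cat# field quoted and once without.
quoted = ('"LASH 21, 886 880-7",Faith No More,Epic,'
          '"Slash Records, London Records","7"", Single",4,1990')
unquoted = ('LASH 21, 886 880-7,Faith No More,Epic,'
            '"Slash Records, London Records","7"", Single",4,1990')

# delimiter and quotechar play the roles of separator and text qualifier.
good = next(csv.reader([quoted], delimiter=",", quotechar='"'))
bad = next(csv.reader([unquoted], delimiter=",", quotechar='"'))

print(len(good), good[0])
print(len(bad), bad[0], bad[1])
```

The unquoted row yields one extra column because the two catalog numbers are split at the comma, which is the symptom reported at the top of the thread.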
  • MetallicRaver 10 years ago
    Great update!

    [quote=303]Well, can anyone recommend a viewer that shows something more than lines of code? [/quote]
    That's what I'm looking for, too.
  • md 10 years ago
    All the text that is showing incorrectly above can be shown fine on Excel 2003 (or indeed Excel 2000 which is what I'm using) with no problems. I've just gone through the even more convoluted and long winded route of opening the file via Open Office Calc and copying back into Excel 2000 and all the text shows fine without the errors (except for the ampersand errors which are still there).

    It just can't be displayed as such when opening from the file Discogs uploads for export. And really any app that can open a .csv file ought to be able to show the data correctly if it's there.
  • Julz72 10 years ago
    nik,

    i got this when i go to the page for downloading the data:

    Status Date Processed Type
    Completed 2008-01-30 02:10:42 Wantlist (XML) Download
    Completed 2008-01-29 23:40:09 Collection (CSV) Download
    Completed 2008-01-29 23:40:08 Collection (XML) Download
    Completed 2008-01-29 23:40:02 Contributions (XML) Download

    and both the zips of Collection (CSV) and Collection (XML) contain a csv file. I'm absolutely sure.

    [quote=MetallicRaver]303
    Well, can anyone recommend a viewer that shows something more than lines of code?
    [/quote]

    i would like something like that as well, i mean just a viewer, not that it has to be converted first...

  • md 10 years ago
    [quote=md]It just can't be displayed as such when opening from the file Discogs uploads for export. And really any app that can open a .csv file ought to be able to show the data correctly if it's there.[/quote]
    btw the previous xls download to Excel 2000 did NOT contain these Unicode errors. Excel hasn't changed in the past couple of days, but Discogs has, so it's clear where the fix needs to be in order for the new functionality to work at least as well as it did before the changes.
  • nickacid 10 years ago
    I've requested a download of 'contributions' accidentally - is there any way to cancel this?
    Thanks.
  • teo 10 years ago
    Thanks for all of the feedback. I wasn't aware that Excel had such poor unicode support for csv files. Discogs is creating them correctly; it's just that Excel does not read them correctly. But I've found a solution and it will be live later today.

    [quote=y-1]I noticed just one mistake when downloading in CSV format: if a record has more than one cat# (usually when it's on more than one label), these cat#s should be enclosed in quotes so that they don't end up in two separate columns, just as is already done for several labels on one release.

    Example

    LASH 21, 886 880-7,Faith No More,Epic,"Slash Records, London Records","7"", Single",4,1990,1174663,http://www.discogs.com/release/1174663 ,Alternative Rock / Metal [/quote]
    Are you sure? I just exported this one to csv and the catalog# field has quotes.


    [quote=nickacid]I've requested a download of 'contributions' accidentally - is there any way to cancel this?[/quote]
    You can just ignore it. The files will get deleted automatically after a week.

  • y-1 10 years ago
    [quote=teo]Are you sure? I just exported this one to csv and the catalog# field has quotes.[/quote]

    Of course I'm sure, I copied & pasted the example from my download, done very shortly after the feature was made available (Completed 2008-01-29 13:55:13 Collection (CSV)). There hasn't been any update since, has there?
  • teo 10 years ago
    y-1, no, there haven't been any changes to that between the time you posted and I replied. I'm not sure why yours is missing quotes and mine has them. What happens if you try again?
  • teo 10 years ago
    There's now an Excel option in the format dropdown and this works correctly with Unicode data.

    Please post if there are any other issues. thanks
  • md 10 years ago
    Cool, that works now.

    The ampersand bug is still there though.
  • y-1 10 years ago
    [quote=teo]What happens if you try again?[/quote]

    Amazing: I tried the same again, and now not only are multiple cat#s grouped together, but all the strange characters are gone, plus I can skip the procedure described above, as the contents are automatically split into columns!
    So, all problems solved, except for the ampersand bug, which I also found an example of...
    Btw, I get the same result when exporting in CSV format as in XLS format.

    I can't tell you what went wrong the first time, as I'm sure I did the same thing as now, but luckily it works now, and I really like this new feature - thanks again!
  • nicreve 10 years ago
    Although I like the new export, I've made two attempts at exporting my collection (CSV and Excel formats) and both of them resulted in a 'Page not found' when I clicked the download link, so if this could be fixed, I'd very much appreciate it. :)
  • Scoz 10 years ago
    I tried the xml download for a laugh, and once it was done I clicked the link and got the "oops, you've followed a bad link" page.

    right click and save as didn't work either.
  • teo 10 years ago
    nicreve, Scoz, I've found the cause of that and fixed it. You should be able to download those files now.
  • 74 10 years ago
    I'm sorry to say
    but i think we should also have the old .xls option
  • 74 10 years ago
    oops
    sorry
    was too quick before checking

  • little_alien 10 years ago
    Is it me or does the XML dump not include a declaration tag like this? [?xml version="1.0" encoding="utf-8"?] (with the ] being a > etc. of course, but the forum filters that out) How will a parser know which character set to use?

    I haven't investigated this thoroughly yet, but at least IE7 doesn't recognize the unicode characters in my downloaded document. I don't have a decent XML viewer at hand at the moment though.
  • mjb 10 years ago
    little_alien, an XML parser will assume UTF-8 if there's no encoding declaration and no BOM. UTF-8 is what the API is producing, so it shouldn't be a problem.

    If you're loading the XML directly in IE7 (not via script), then it might be using heuristic analysis to guess at the character encoding, depending on your settings.

    Can you provide a release ID that demonstrates the problem?

    Also make a note of your View > Encoding settings (if any).
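
The point about the missing encoding declaration can be checked with a conforming parser. A small sketch using Python's `xml.etree.ElementTree`; the element names are hypothetical and are not taken from the actual Discogs export.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: UTF-8 bytes with no <?xml ...?> declaration and no BOM.
raw = "<release><label>Sähkö Recordings</label></release>".encode("utf-8")

# Without a declaration or BOM, the parser falls back to UTF-8.
root = ET.fromstring(raw)
label = root.findtext("label")
print(label)
```

A browser that guesses the encoding heuristically may behave differently from a strict parser, which would fit the IE7 observation above.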