Python json save as csv

SDohmen · September 15, 2021, 11:21am

Hello,

For one of my work related api's i am getting a JSON response back from a API call that i make. In this JSON response all product info is located and i need to save parts of it in a database.

The problem that i have is that i can get the data to a CSV but because for 1 of the fields is also an array this is not displayed correctly.

The JSON response is as follows (shortened down quite alot):

{
   "success": true,
   "data": [
      {
         "idproduct": 12609836,
         "idvatgroup": 10059,
         "idsupplier": 42582,
         "productcode": "00000A00186918",
         "name": "A00186918 - 00000A00186918 - Partytent 3x3 met 2 Zijwanden Grijs",
         "price": 0,
         "fixedstockprice": 0,
         "productcode_supplier": "0",
         "deliverytime": 99,
         "description": "Partytent 3x3 met 2 Zijwanden Grijs",
         "barcode": "",
         "unlimitedstock": false,
         "assembled": false,
         "type": "normal",
         "weight": 0,
         "length": 146,
         "width": 20,
         "height": 20,
         "minimum_purchase_quantity": 0,
         "purchase_in_quantities_of": 0,
         "hs_code": "",
         "country_of_origin": "",
         "active": true,
         "comment_count": 0,
         "analysis_abc_classification": null,
         "analysis_pick_amount_per_day": null,
         "tags": [],
         "productfields": [],
         "images": [],
         "stock": [
            {
               "idwarehouse": 4861,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4863,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4862,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4865,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4870,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4871,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4872,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4873,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 5397,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            }
         ],
         "pricelists": []
      },
      {
         "idproduct": 12609870,
         "idvatgroup": 10059,
         "idsupplier": 42582,
         "productcode": "00000A00189619",
         "name": "A00189619 - 00000A00189619 - Partytent 3x3 met 4 Zijwanden Grijs",
         "price": 0,
         "fixedstockprice": 0,
         "productcode_supplier": "0",
         "deliverytime": 99,
         "description": "Partytent 3x3 met 4 Zijwanden Grijs",
         "barcode": "",
         "unlimitedstock": false,
         "assembled": false,
         "type": "normal",
         "weight": 0,
         "length": 146,
         "width": 20,
         "height": 20,
         "minimum_purchase_quantity": 0,
         "purchase_in_quantities_of": 0,
         "hs_code": "",
         "country_of_origin": "",
         "active": true,
         "comment_count": 0,
         "analysis_abc_classification": null,
         "analysis_pick_amount_per_day": null,
         "tags": [],
         "productfields": [],
         "images": [],
         "stock": [
            {
               "idwarehouse": 4861,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4863,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4862,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4865,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4870,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4871,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4872,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4873,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 5397,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            }
         ],
         "pricelists": []
      },
      {
         "idproduct": 10791362,
         "idvatgroup": 9520,
         "idsupplier": 42582,
         "productcode": "V0718037867724",
         "name": "WD - Western Digital My Passport Go Portable SSD Drive, 1TB, USB 3.0, cobalt blue",
         "price": 0,
         "fixedstockprice": 0,
         "productcode_supplier": "-NA-",
         "deliverytime": 99,
         "description": null,
         "barcode": "0718037867724",
         "unlimitedstock": false,
         "assembled": false,
         "type": "normal",
         "weight": null,
         "length": null,
         "width": null,
         "height": null,
         "minimum_purchase_quantity": 0,
         "purchase_in_quantities_of": 0,
         "hs_code": null,
         "country_of_origin": null,
         "active": true,
         "comment_count": 0,
         "analysis_abc_classification": null,
         "analysis_pick_amount_per_day": null,
         "tags": [],
         "productfields": [],
         "images": [],
         "stock": [
            {
               "idwarehouse": 4871,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4873,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            },
            {
               "idwarehouse": 4865,
               "stock": 0,
               "reserved": 0,
               "reservedbackorders": 0,
               "reservedpicklists": 0,
               "reservedallocations": 0,
               "freestock": 0
            }
         ],
         "pricelists": []
      }
   ]
}

Currently i have this working by taking this JSON output and transform it to XML and then use xmlstarlet to grab the items that i want. The problem with this, is that it takes quite a long time before the complete thing is processed so i want to rebuild it to python in the hopes that it will be alot faster (apart from getting the data itself, i cannot speed that up).

The code that i currently have is as follows:

import json

content = '{"success":true,"data":[{"idproduct":12609836,"idvatgroup":10059,"idsupplier":42582,"productcode":"00000P00189918","name":"P00189918 - 8719689225469 - Garden Royal Easy up Partytent 3x3 met 2 Zijwanden Donker Grijs","price":0,"fixedstockprice":0,"productcode_supplier":"0","deliverytime":99,"description":"Garden Royal Easy up Partytent 3x3 met 2 Zijwanden Donker Grijs","barcode":"","unlimitedstock":false,"assembled":false,"type":"normal","weight":0,"length":146,"width":20,"height":20,"minimum_purchase_quantity":0,"purchase_in_quantities_of":0,"hs_code":"","country_of_origin":"","active":true,"comment_count":0,"analysis_abc_classification":null,"analysis_pick_amount_per_day":null,"tags":[],"productfields":[],"images":[],"stock":[{"idwarehouse":4861,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4863,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4862,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4865,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4870,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4871,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4872,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4873,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":5397,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0}],"pricelists":[]},{"idproduct":12609870,"idvatgroup":10059,"idsupplier":42582,"productcode":"00000P00189919","name":"P00189919 - 8719689225476 - Garden Royal Easy up Partytent 3x3 met 4 Zijwanden Donker Grijs","price":0,"fixedstockprice":0,"productcode_supplier":"0","deliverytime":99,"description":"Garden Royal Easy up Partytent 3x3 met 4 Zijwanden Donker Grijs","barcode":"","unlimitedstock":false,"assembled":false,"type":"normal","weight":0,"length":146,"width":20,"height":20,"minimum_purchase_quantity":0,"purchase_in_quantities_of":0,"hs_code":"","country_of_origin":"","active":true,"comment_count":0,"analysis_abc_classification":null,"analysis_pick_amount_per_day":null,"tags":[],"productfields":[],"images":[],"stock":[{"idwarehouse":4861,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4863,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4862,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4865,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4870,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4871,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4872,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4873,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":5397,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0}],"pricelists":[]},{"idproduct":10791362,"idvatgroup":9520,"idsupplier":42582,"productcode":"E0718037867724","name":"WD - Western Digital My Passport Go Portable SSD Drive, 1TB, USB 3.0, cobalt blue","price":0,"fixedstockprice":0,"productcode_supplier":"-NA-","deliverytime":99,"description":null,"barcode":"0718037867724","unlimitedstock":false,"assembled":false,"type":"normal","weight":null,"length":null,"width":null,"height":null,"minimum_purchase_quantity":0,"purchase_in_quantities_of":0,"hs_code":null,"country_of_origin":null,"active":true,"comment_count":0,"analysis_abc_classification":null,"analysis_pick_amount_per_day":null,"tags":[],"productfields":[],"images":[],"stock":[{"idwarehouse":4871,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4873,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0},{"idwarehouse":4865,"stock":0,"reserved":0,"reservedbackorders":0,"reservedpicklists":0,"reservedallocations":0,"freestock":0}],"pricelists":[]}]}'

products = json.loads(content)

products = products["data"]

for product in products:
    print(product["idproduct"],";",product["productcode"],product["name"],product["stock"][1]["idwarehouse"])

The problem however is that with the 3rd product, the warehouse where it grabs the data from it not the same as the other products. This will create faulty results.

I tried different things but i cannot seem to figure out how i get the results as below:

12609836;10059;42582;00000A00186918;A00186918 - 00000A00186918 - Partytent 3x3 met 2 Zijwanden Grijs;0;0;0;99;;normal;0;0;0;;0;0;0;0;0;0;0;0;
12609870;10059;42582;00000A00189619;A00189619 - 00000A00189619 - Partytent 3x3 met 4 Zijwanden Grijs;0;0;0;99;;normal;0;0;0;;0;0;0;0;0;0;0;0;
10791362;9520;42582;V0718037867724;WD - Western Digital My Passport Go Portable SSD Drive, 1TB, USB 3.0, cobalt blue;0;0;-NA-;99;0718037867724;normal;;;;;0;;;;0;;0;;

Since the above results arent very clear i am posting the xmlstarlet code that i use to clearify:

xmlstarlet sel -T -t -m /searchresults/data -v "concat(idproduct,';', \
        idvatgroup,';', \
        idsupplier,';', \
        productcode,';', \
        name,';', \
        price,';', \
        fixedstockprice,';', \
        productcode_supplier,';', \
        deliverytime,';', \
        barcode,';', \
        type,';', \
        stock[idwarehouse="4861"]/freestock,';', \
        stock[idwarehouse="4861"]/stock,';', \
        stock[idwarehouse="4861"]/reserved,';', \
        stock[idwarehouse="4861"]/backorder,';', \
        stock[idwarehouse="4865"]/freestock,';', \
        stock[idwarehouse="4862"]/freestock,';', \
        stock[idwarehouse="4863"]/freestock,';', \
        stock[idwarehouse="4870"]/freestock,';', \
        stock[idwarehouse="4871"]/freestock,';', \
        stock[idwarehouse="4872"]/freestock,';', \
        stock[idwarehouse="4873"]/freestock,';', \
        stock[idwarehouse="5397"]/freestock,';', \
        stock[idwarehouse="6546"]/freestock
)"

Who can tell me how i can solve this issue? or point me in the right direction.

Neo · September 15, 2021, 11:40am

You can also consider doing this JSON to CSV (a standard library task) conversion in Ruby using a gem:

Example:

json2csv convert data/sample.json

Outputs CSV data.....

This kind of "standard task" is very common and so there are many lib functions to go from JSON to CSV. The above example is just one of many ways to do this. Normally I do these conversions in Ruby; but I'm confident there is a Python lib for this as well.

SDohmen · September 15, 2021, 12:43pm

Thank you for the answer and it looks interesting but i do not have ruby installed (and i am not really planning to do so because my complete lack of knowledge for ruby) but again it is interesting and a possible last resort.

I did some searches for python versions of these kinda tasks but i could not find a good one. That might be because the json i am feeding is not the finest. I might have to strip a part so i only keep the data structure and go from there. I tried this in a online conversion tool which goes from json to csv but here i noticed the problem with the stock data as the 3rd does not have the same warehouses coupled which in turn writes the data in the wrong columns.

Neo · September 15, 2021, 12:44pm

What OS are you running?

SDohmen · September 15, 2021, 12:56pm

I am running a Debian 11 install. I know i can install ruby but i would prefer to stick to the languages that i know (bash) or am learning (python) since i will most likely use this for other scripts also.

Azhrei · September 15, 2021, 10:45pm

I'm interested in taking a shot at this, but I'm not clear on the requirements.

Is the xmlstarlet command you posted that actual desired output? You're retrieving four fields from the first WH and only one field from the others.
The WH ids are hard-coded in the command to always be in a certain order. That order appears to be numeric, except that "4865" is out of position.

I'm proposing to scan all products looking for stock entries, then retrieve all of the WH ids. This list then becomes one column (or four columns) per WH in the output. So if a particular product isn't in a given WH, the output would show empty fields for that WH.

However, this means the format of the CSV could change depending on the input. The code I've written so far allows the list of WH ids to be hard-coded so that you could just enter them yourself.

Let me know and I'll finish up what I have.

SDohmen · September 16, 2021, 8:23am

Hello Azhrei,

Thank you for taking a crack at it. To give you hopefully some extra info.

The XMLstarlet command is the current format that we have and need (we import this in a database without headers although those could be added if needed). The first warehouse 4861 is our main warehouse and thats why we need more fields for that one. For all the others we only need the freestock.
It is correct that the 4865 warehouse is out of order because this one got added later on. As long as i know which column is which warehouse i can reorder them if needed so that should not be a problem.

The only important thing is that everything for warehouse 4865 (or any other) is located in its own column and not mixed in with any of the others. All empty calues can be 0 btw if that helps. The 3 products above do not have all warehouses btw but the xmlstarlet command does have them all.

Since i am not sure what you mean with the format change i would have to see it. Is it perhaps an idea to post the bear minimum so i can test run it to see the output on the full json output. In case it changes it might be a good idea to add a header line and i can hustle the columns in the proper order by just loading it into a dataframe and reorder/drop to the correct standard.

Since the code should be as easy as possible (i am still a novice at python) it is even fine if you throw all warehouse data into a column and i will filter them. With the xmlstarlet command it was quite easy for me to pick the correct columns and since i do have a bit of pandas exp, i can do the same with the full flat data if that makes the code easier.

I hope this info makes it better to understand otherwise let me know. Thanks again for the help.

Neo · September 16, 2021, 8:46am

Hey @SDohmen

I realize you want to learn Python and of course, that's very cool.

Just FYI only, in less than the time it takes you to reply in this community topic, you could have installed Ruby, added the Ruby Gem I mentioned, and already be converting JSON to CSV.

Maybe you have a lot of spare time on your hands or are stuck on Python or Bash, but to be honest, Ruby is a better programming language for converting JSON to CSV and it can be installed in less than 15 minutes, from scratch. This means that you could solve the JSON to CSV issue in Ruby in less than 30 minutes.

Keep in mind that Ruby can be "shebanged" like any shell script:

#!/usr/bin/ruby

def your_method_here

end

So, when you write a script, it does not need to be Bash or Ksh or Python, it can be "the best tool for the job" and just "shebang" it

It always give me a smile when people have a task they want to solve and they insist to do it using tools which are sub-optimal for the task; when it is easy to get a "better" tool up and running. Ruby comes OOTB in almost all Linux distributions.

Of course, if you want a Python solution, go for it! Have fun.

So far, it has been a full day since your original post. You could have solved this using Ruby in 30 mins, including install from scratch.

SDohmen · September 16, 2021, 9:11am

Hello Neo,

You are absolutly right that i could have gone for the fast solution but then i would not have been able to learn something new.

In my case it is that currently we are running alot of bash scripts which i am slowly converting to python scripts so i can learn a new language and (hopefully) do fast processing of all the data we want to have. By learning new things i am able to use that knowledge in other scripts and all.

Fast is good but in my case not the best solution for learning purposes.

Skrynesaver · September 16, 2021, 9:23am

The CSV output reflects the data as supplied, unless you wish to put a filter on the loop: untested code below:

import re
valid_name = re.compile("^[A-Z]\d+\s-\s\d+$")
for product in products['data']:
   if ( valid_name.match(product['name'])):
      print(product["idproduct"],";",product["productcode"],product["name"],product["stock"][1]["idwarehouse"])
   else:
      print( f"Invalid record: {product}",file=sys.stderr)

SDohmen · September 16, 2021, 9:44am

Hello Skrynesaver,

Thank you for the code. I checked this but as you mentioned this is untested. It has a problem with the regex for the name which actually not needed. When i call the API i already have the needed filtering in place so i can process all the items in the json reply.

I tried removing the regex part which worked but i then get the same problem as before. It mixes the columns with the warehouses.

With your code the last column values are 4863, 4863 and 4873 in which the last one should have been 0 as it does not have that warehouse.

Skrynesaver · September 16, 2021, 11:36am

The idea of using a regex for the name filed was to reject records such as the one you highlight which have a different pattern, was that not your intention?

Ahh, there's an error in my reading of the data, the pattern should be as below:

 python3
Python 3.8.10 (default, May 20 2021, 11:41:59)
>>> import re
>>> valid_name = re.compile("^[A-Z]\d+\s-\s\d+")
>>> names = [ "A00186918 - 00000A00186918 - Partytent 3x3 met 2 Zijwanden Grijs", "A00189619 - 00000A00189619 - Partytent 3x3 met 4 Zijwanden Grijs", "WD - Western Digital My Passport Go Portable SSD Drive, 1TB, USB 3.0, cobalt blue"]

>>> for name in names:
...   if valid_name.match(name) :
...     print(f"{name} is valid")
...   else:
...     print(f"{name} doesnt meet the criteria")
...
A00186918 - 00000A00186918 - Partytent 3x3 met 2 Zijwanden Grijs is valid
A00189619 - 00000A00189619 - Partytent 3x3 met 4 Zijwanden Grijs is valid
WD - Western Digital My Passport Go Portable SSD Drive, 1TB, USB 3.0, cobalt blue doesnt meet the criteria

SDohmen · September 16, 2021, 11:45am

There is no need for any rejecting. I need to process all stock products in the file. The problem lies with the stock part of the json which mixes as you can see in the first post.

Neo · September 16, 2021, 1:48pm

I thought your task was to do a straight conversion from JSON to CSV, isn't that right @SDohmen ?

OBTW, I ran your JSON data though a JSON linter, and it checked out perfectly OK, so any 'good' json-to-csv library method should work for you.

SDohmen · September 16, 2021, 2:18pm

For this particular script it is mostly a straight down conversion but this same code i will use for other api calls which have the same sort of nested arrays as this one. With the code from this one i am hoping that i will have it easier to convert those to python.

Neo · September 16, 2021, 2:26pm

Somthing like this (untested)?


import json
import csv
 
with open('./data.json') as json_file:
    jsondata = json.load(json_file)
 
data_file = open('./data.csv', 'w', newline='')
csv_writer = csv.writer(data_file)
 
count = 0
for data in jsondata:
    if count == 0:
        header = data.keys()
        csv_writer.writerow(header)
        count += 1
    csv_writer.writerow(data.values())
 
data_file.close()

or maybe just use pandas (untested)?

import pandas as pd

pdObj = pd.read_json('./data.json', orient='index')
csvData = pdObj.to_csv(index=False)
print(csvData)

See also:

https://pandas.pydata.org/

There are countless tutorials on the net where folks convert JSON to CSV using Python. Here is one of many:

Azhrei · September 16, 2021, 7:50pm

Here's my solution. I've tried to add copious comments in areas where I think the code might not be clear for a beginner. Some places have two pieces of code that accomplish similar things, with one piece commented out. I did this as a way of simplifying the code and/or explaining a complicated piece by using a longer, more verbose technique as an alternative.

The function documentation is meant to be processed as reStructured Text and/or Markdown (I think the formatting I used applies to both, but I'm too lazy to check it). Using a (free) tool like PyCharm from Jetbrains.com, you can hover the mouse over a function name to see the formatted documentation in a tooltip.

I left out a few things in pursuit of simplicity, like using try-except blocks to catch exceptions. But have no fear — if Python detects an exception, it'll definitely let you know.

Please ask if you have questions.

This code assumes that the JSON input is in a file named test-data.json in the current directory.

I created it on macOS, so line endings are just <LF> (shouldn't matter to a modern editor).

~~I used an attachment, as the code is rather long. I realize this makes it more difficult for others to comment on it; if this breaks some rule, I presume the moderators will let me know it.~~

#!/usr/bin/env python3

"""
Convert application-defined JSON to a specific CSV format.

# Operational overview

. Convert JSON to Python dictionary (actually, a dictionary of lists
containing dictionaries of lists)

. Generate a list of all warehouse ("WH") ids that occur in the input data.

. Print a CSV header line defining which fields in the CSV are which WH ids.

. Process each entry in the `data` list (these are "products").

.. Create a list that has certain predefined fields from the product.

.. Append to the list, one field for each known WH id.  If the given WH
id doesn't appear in the product, fill in a zero ("0") so that the CSV
field isn't empty.

.. The first WH should emit four values; other WHes are only a single value.

.. Print the list with the given delimiter between fields (set to semicolon;
change the value of `output_delim`, if desired).
"""

import json

output_delim = ";"
fields = [
    "idproduct",
    "idvatgroup",
    "idsupplier",
    "productcode",
    "name",
    "price",
    "fixedstockprice",
    "productcode_supplier",
    "deliverytime",
    "barcode",
    "type",
]


def flatten(prod: dict, prefix: list, whids: list) -> list:
    """
    Takes a single product dictionary and extracts a predefined list of
    fields (the `prefix` list) and warehouse data into a singe list.  If a
    given WH doesn't appear in a product, the return value will include a
    string zero (`"0"`) as a placeholder.

    :param prod: dictionary describing product data
    :param prefix: list of keys whose data should appear at the beginning of the return value
    :param whids: list of all WH ids known
    """
    # Every record has certain predefined fields at the beginning
    result = [str(prod[f]) for f in prefix]

    # Remaining fields are defined by `whids`.
    # They are stored in a list, so we loop through that list.  If the
    # `idwarehouse` exists under prod["stock"], use it; otherwise, use
    # an empty string.

    # Start by creating a dictionary so that future lookups are easier/faster.
    stock_dict = { s["idwarehouse"]: s for s in prod["stock"] }
    for id in whids:
        if id in stock_dict:
            if id == "4861":
                result.extend([stock_dict[id][field]
                    for field in ("freestock", "stock", "reserved", "backorder")])
                # Or, if you find this easier to understand:
                #result.append(str(stock_dict[id]["freestock"]))
                #result.append(str(stock_dict[id]["stock"]))
                #result.append(str(stock_dict[id]["reserved"]))
                #result.append(str(stock_dict[id]["backorder"]))
            else:
                result.append(str(stock_dict[id]["freestock"]))
        else:
            result.append("0")

    return result


def generate_header(prefix: list, whids: list):
    """
    Creates the header to use as the first line of output in the CSV file.
    Since the first WH is treated different from the others, it is sectioned
    off and handled separately here.

    :param prefix: list of product fields that appear on every line of output
    :param whids: list of WH ids that will appear in the output
    """
    result = prefix + [whids[0], "stock", "reserved", "backorder"] + whids[1:]
    return output_delim.join(result)


with open("test-data.json", encoding="utf-8") as content:
    products = json.loads(content.read())
    products = products["data"]

    # Scan all warehouses and make a list of unique values.

    # Create a `set` of all WH ids based on input, then convert to a sorted list.
    #ids = sorted({str(wh["idwarehouse"]) for product in products for wh in product["stock"]})

    # Using a hard-coded list of WH ids
    ids = ["4861", "4865", "4862", "4863", "4870", "4871", "4872", "4873", "5397", "6546" ]

    print(generate_header(fields, ids))
    for product in products:
        # Fields are printed first, followed by all 'freestock' fields from
        # warehouses.
        record = flatten(product, fields, ids)

        # Could use csv.writer() here, but there's not much benefit and
        # importing the module would mean longer startup time (probably
        # negligible, but still).
        print(output_delim.join(record))

Cheers!

SDohmen · September 20, 2021, 10:09am

Hello,

Sorry for the late reply but i had some off days .

This morning i started with the comment from @Neo. Your first code does not work. It gave me an error:

AttributeError                            Traceback (most recent call last)
<ipython-input-8-ff712033b3f6> in <module>
     11 for data in jsondata:
     12     if count == 0:
---> 13         header = data.keys()
     14         csv_writer.writerow(header)
     15         count += 1

AttributeError: 'str' object has no attribute 'keys'

I am testing in anaconda/jupyter btw as thats very handy for these things.

Your second code was more interesting btw. I had some minor fiddling with it and i came across json_normalize. With some more fiddling i got the following code:

import pandas as pd

with open('products.json') as json_file:
    jsondata = json.load(json_file)

jsondata = jsondata["data"]  


pdObj = pd.json_normalize(jsondata)
csvData = pdObj.to_csv(index=False)
pdObj.head()
#print(csvData)

I had to comment the print out as the products file is the complete file and it gave me errors that it was too large to print so i changed it to the head only.

This gave me the following result:

 	idproduct 	idvatgroup 	idsupplier 	productcode 	name 	price 	fixedstockprice 	productcode_supplier 	deliverytime 	description 	... 	tags 	productfields 	images 	stock 	pricelists 	tags.Multi-Colli.idtag 	tags.Multi-Colli.title 	tags.Multi-Colli.color 	tags.Multi-Colli.inherit 	tags.Multi-Colli.textColor
0 	12609836 	10059 	42582.0 	00000P00189918 	P00189918 - 8719689225469 - Garden Royal Easy ... 	0.0 	0.0 	0 	99 	Garden Royal Easy up Partytent 3x3 met 2 Zijwa... 	... 	[] 	[] 	[] 	[{'idwarehouse': 4861, 'stock': 0, 'reserved':... 	[] 	NaN 	NaN 	NaN 	NaN 	NaN
1 	12609870 	10059 	42582.0 	00000P00189919 	P00189919 - 8719689225476 - Garden Royal Easy ... 	0.0 	0.0 	0 	99 	Garden Royal Easy up Partytent 3x3 met 4 Zijwa... 	... 	[] 	[] 	[] 	[{'idwarehouse': 4861, 'stock': 0, 'reserved':... 	[] 	NaN 	NaN 	NaN 	NaN 	NaN
2 	10791362 	9520 	42582.0 	E0718037867724 	WD - Western Digital My Passport Go Portable S... 	0.0 	0.0 	-NA- 	99 	None 	... 	[] 	[] 	[] 	[{'idwarehouse': 4871, 'stock': 0, 'reserved':... 	[] 	NaN 	NaN 	NaN 	NaN 	NaN
3 	10965410 	9520 	42582.0 	E0814914025079 	Crucial MX100 250GB Sata SSD 	0.0 	0.0 	0 	99 		... 	[] 	[] 	[] 	[{'idwarehouse': 4861, 'stock': 0, 'reserved':... 	[] 	NaN 	NaN 	NaN 	NaN 	NaN
4 	10503336 	9520 	42582.0 	K00000001 	Grand Canyon Phoenix M - Koepeltent - 3-Persoo... 	69.0 	46.7 		None 	Dit betreft een artikel uit onze koopjeshoek.\... 	... 	[] 	[] 	[https://files-cdn.picqer.net/3025/8fc1ec35-5e... 	[{'idwarehouse': 4861, 'stock': 0, 'reserved':... 	[{'idpricelist': 769, 'price': 0}, {'idpriceli... 	NaN 	NaN 	NaN 	NaN 	NaN

5 rows × 36 columns

Which is close but again it does not flatten the stock part. Since i am still new to json_normalize i have to tinker a bit more with the code in the hopes that i can flatten the stock part and then hopefully have them in order also.

@Azhrei i will test your code now to see what happens and edit this comment to let you know.

Neo · September 20, 2021, 10:23am

Hey @SDohmen

I think you do not understand how to work with others in forums.

First of all, I did not tell you that code sample was for you to use without refining it.

I gave you that example so you would actually modify it yourself and do your own work.

When people post code for you, you should not expect a "finished product". We are here to help you learn to write your own code, and we are not a code writing service where you get code from us, test it, find errors and ask us to fix it, while you play the role of manager.

I am closing this topic @SDohmen because you seem to think you are our manager and you have no obligation to code your own solutions; but instead you are asking us to code for you while you play the role of tester.

This is against our forum rules. Everyone here must do their own work and code their own solutions. We offer guidance and help; not a code writing service for people without coding skills.

Neo · September 20, 2021, 10:23am

@SDohmen

Post back again when you are willing to do your own work and not act as "our manager", getting us to write code for you while you test and "report back" and ask us to fix it.

We are NOT here to write code for you. We are here to help you write your own code.,