I can't say for sure whether CSV or JSON would perform better or worse than markdown when communicating knowledge to an LLM. In most cases before this I would just pass in plain text, or whatever I copied and pasted from a webpage. Markdown felt like a nice balance: it stays human readable while still mimicking, in a machine-readable way, the tabular format you find in schema documentation.
I do know that when embedding documents for vector inference, you have to chunk them into logical pieces. Markdown makes that easy because it has a defined way of marking headers with the # symbol. When I embed documents for inference, I found it easy to chunk them on those headers, and I figure engineers training models have similar pipelines using tools like LlamaParse.
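Just to show what I mean by header-based chunking, here's a rough sketch in Python. It's my own toy example, not any particular library's API, and the function name chunk_by_headers is made up for illustration:

```python
import re

def chunk_by_headers(markdown_text: str) -> list[str]:
    """Split a markdown document into chunks, starting a new chunk at each header."""
    chunks = []
    current: list[str] = []
    for line in markdown_text.splitlines():
        # A markdown header line starts with 1-6 '#' characters followed by a space
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = """# Users table
Stores one row per account.

## Columns
| name | type |
|------|------|
| id   | int  |

# Orders table
Links back to users via user_id.
"""

for chunk in chunk_by_headers(doc):
    print("---\n" + chunk)
```

Each chunk then gets embedded on its own, so a header and the text under it stay together as one logical piece.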
As with a lot of AI/ML stuff, it helps when you can pass in your question in a format similar to the one the model saw a lot of during training... I bet YAML does a good job because of the thousands of Stack Overflow posts, lol. But markdown may be more broadly usable and easier to keep organized on my computer.