What is CSV?

CSV (Comma-Separated Values) is a simple file format used to store and exchange tabular data. It is widely used in data analysis, spreadsheets, databases, and data exchange between applications.

1. Why Use CSV?

2. CSV File Structure

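Each line in a CSV file represents a row (record), and the values within a row are separated by commas. The first line often serves as a header that names the columns. Most parsers treat a space after a comma as part of the value, so the example below omits them.

Name,Age,Email
John Doe,30,johndoe@example.com
Alice Smith,25,alicesmith@example.com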

3. CSV Delimiters

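Although the comma is the default, other delimiters such as semicolons or tabs can be used. Semicolons are common in locales where the comma is the decimal separator, and tab-separated files are often saved with a .tsv extension.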

Semicolon (`;`)

Name;Age;Email
John Doe;30;johndoe@example.com

Tab-separated (`\t`)

Name    Age    Email
John Doe    30    johndoe@example.com
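
As a rough illustration of the delimiters above, Python's csv module accepts a `delimiter` argument. The sample strings and the `parse` helper here are made up for this sketch; real data would normally come from a file.

```python
import csv
import io

# Hypothetical sample data standing in for the contents of two files.
semicolon_text = "Name;Age;Email\nJohn Doe;30;johndoe@example.com\n"
tab_text = "Name\tAge\tEmail\nJohn Doe\t30\tjohndoe@example.com\n"


def parse(text, delimiter):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


semicolon_rows = parse(semicolon_text, ";")
tab_rows = parse(tab_text, "\t")

# Both inputs describe the same table, so the parsed rows are identical.
print(semicolon_rows)
print(tab_rows)
```

The only thing that changes between the two calls is the delimiter, which is why the reader and the writer of a file need to agree on it.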

4. Working with CSV Files

In Python

Reading a CSV File

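import csv

# newline="" lets the csv module handle line endings itself,
# which matters for quoted fields that contain line breaks.
with open("data.csv", "r", newline="") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # each row is a list of strings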

Writing to a CSV File

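import csv

# The first row is the header; the remaining rows are records.
data = [
    ["Name", "Age", "Email"],
    ["John Doe", 30, "johndoe@example.com"],
    ["Alice Smith", 25, "alicesmith@example.com"]
]

# newline="" prevents extra blank lines between rows on Windows.
with open("output.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(data)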

In JavaScript

Parsing CSV Data

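// Naive parser: fine for simple data, but it breaks on quoted fields
// that contain commas or line breaks.
const csvText = "Name,Age,Email\nJohn Doe,30,johndoe@example.com";
const rows = csvText.split("\n").map(row => row.split(","));
console.log(rows);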

5. CSV vs Other Formats

| Feature | CSV | JSON | XML |
|---|---|---|---|
| Readability | Simple, human-readable | Readable, but structured | Verbose, hierarchical |
| Structure | Flat (rows/columns) | Key-value pairs | Tree-based |
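
To make the flat versus key-value contrast concrete, here is a minimal sketch that turns a small CSV string into JSON with Python's standard library. The sample data and variable names are illustrative only.

```python
import csv
import io
import json

# Illustrative CSV input: a header row followed by one record.
csv_text = "Name,Age,Email\nJohn Doe,30,johndoe@example.com\n"

# DictReader pairs each value with the column name from the header row.
records = list(csv.DictReader(io.StringIO(csv_text, newline="")))

# Serialize the list of dictionaries as a JSON array of objects.
json_text = json.dumps(records, indent=2)
print(json_text)
```

Note that the age comes out as the string "30": CSV carries no type information, so any conversion to numbers or dates has to be done by the program reading it.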

6. Handling Special Cases

1. Handling Commas in Values

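Values containing commas must be enclosed in double quotes (`"`). The opening quote must come directly after the delimiter; many parsers will not recognize a quoted field if a space precedes the quote. A double quote inside a quoted value is escaped by doubling it (`""`).

Name,Age,Address
John Doe,30,"123 Main St, New York"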

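In Python, the csv module applies this quoting automatically when writing and removes it when reading. The sketch below uses an in-memory buffer instead of a file, and compares the result with a naive `split(",")`; the variable names are illustrative.

```python
import csv
import io

row = ["John Doe", 30, "123 Main St, New York"]

# Write one row to an in-memory buffer; the writer quotes only
# the field that contains a comma.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(row)
line = buffer.getvalue()
print(line.strip())

# Reading the line back yields three fields, with the quotes removed.
parsed = next(csv.reader(io.StringIO(line, newline="")))
print(parsed)

# A naive split treats the comma inside the address as a delimiter
# and produces four pieces instead of three.
naive = line.strip().split(",")
print(len(parsed), len(naive))
```

This is the main reason to prefer a CSV library over splitting on commas by hand.
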
2. Handling New Lines in Values

Enclose values that contain line breaks in double quotes so that the parser treats the line break as part of the value rather than as the end of the row.

Name,Description
Product 1,"This is a great product.
It has multiple features."
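
A small sketch of how a parser handles this case, using Python's csv module on an in-memory string (the sample text is illustrative). The input spans three physical lines but parses into two rows.

```python
import csv
import io

# Three physical lines: a header, then one record whose second
# field contains a line break inside double quotes.
text = (
    "Name,Description\n"
    'Product 1,"This is a great product.\n'
    'It has multiple features."\n'
)

rows = list(csv.reader(io.StringIO(text, newline="")))

print(len(text.splitlines()))  # physical lines in the input
print(len(rows))               # logical rows after parsing
print(repr(rows[1][1]))        # the line break is kept inside the value
```

Because a record can span several lines, counting lines in a CSV file does not reliably give the number of records.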

3. Handling Missing Data

Leave a field empty to represent a missing value. The delimiters on either side stay in place, so the row keeps the same number of columns.

Name,Age,Email
John Doe,30,johndoe@example.com
Alice Smith,,alicesmith@example.com
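
When reading such a file in Python, an empty field arrives as an empty string. The sketch below converts empty strings to `None`; that mapping is an assumption made for this example, and other conventions (a default value, or leaving the empty string as is) are equally valid.

```python
import csv
import io

# Illustrative data: the second record has no value for Age.
text = (
    "Name,Age,Email\n"
    "John Doe,30,johndoe@example.com\n"
    "Alice Smith,,alicesmith@example.com\n"
)

records = []
for record in csv.DictReader(io.StringIO(text, newline="")):
    # Assumption for this sketch: treat an empty string as a missing value.
    cleaned = {key: (value if value != "" else None) for key, value in record.items()}
    records.append(cleaned)

print(records[1])
```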

7. Where is CSV Used?

Conclusion

CSV is a simple and effective format for storing tabular data. It is widely supported across applications and is easy to process programmatically.