API
This part of the documentation covers all the interfaces of Tablib. For parts where Tablib depends on external libraries, we document the most important right here and provide links to the canonical documentation.
Dataset Object
class tablib.Dataset(*args, **kwargs)
The Dataset object is the heart of Tablib. It provides all core functionality.
Usually you create a Dataset instance in your main module, and append rows as you collect data.
    data = tablib.Dataset()
    data.headers = ('name', 'age')
    for (name, age) in some_collector():
        data.append((name, age))
Setting columns is similar. The column data length must equal the current height of the data, and headers must be set.
    data = tablib.Dataset()
    data.headers = ('first_name', 'last_name')
    data.append(('John', 'Adams'))
    data.append(('George', 'Washington'))
    data.append_col((90, 67), header='age')
You can also set rows and headers upon instantiation. This is useful if dealing with dozens or hundreds of Dataset objects.
    headers = ('first_name', 'last_name')
    data = [('John', 'Adams'), ('George', 'Washington')]
    data = tablib.Dataset(*data, headers=headers)
Parameters:
- *args – (optional) list of rows to populate the Dataset
- headers – (optional) list of strings for the Dataset header row
- title – (optional) string to use as the title of the Dataset
Format Attributes Definition
If you look at the code, the various output/import formats are not defined within the Dataset object. To add support for a new format, see Adding New Formats.
add_formatter(col, handler)
Adds a formatter to the Dataset.
New in version 0.9.5.
Parameters:
- col – the column to format. Accepts index int or header str.
- handler – reference to callback function to execute against each cell value.
append(row, tags=[])
Adds a row to the Dataset. See Dataset.insert for additional documentation.
append_col(col, header=None)
Adds a column to the Dataset. See Dataset.insert_col for additional documentation.
csv
A CSV representation of the Dataset object. The top row will contain headers, if they have been set. Otherwise, the top row will contain the first row of the dataset.
A dataset object can also be imported by setting the Dataset.csv attribute.
    data = tablib.Dataset()
    data.csv = 'age, first_name, last_name\n90, John, Adams'
Import assumes (for now) that headers exist.
Binary Warning
Dataset.csv uses \r\n line endings by default, so make sure to write in binary mode:
    with open('output.csv', 'wb') as f:
        f.write(data.csv)
If you do not do this, and you export the file on Windows, your CSV file will open in Excel with a blank line between each row.
dbf
A dBASE representation of the Dataset object.
A dataset object can also be imported by setting the Dataset.dbf attribute.
    # To import data from an existing DBF file (opened in binary mode):
    data = tablib.Dataset()
    with open('existing_table.dbf', 'rb') as f:
        data.dbf = f.read()

    # To import data from an ASCII-encoded bytestring:
    data = tablib.Dataset()
    data.dbf = '<bytestring of tabular data>'
Binary Warning
Dataset.dbf contains binary data, so make sure to write in binary mode:
    with open('output.dbf', 'wb') as f:
        f.write(data.dbf)
df
A DataFrame representation of the Dataset object.
A dataset object can also be imported by setting the Dataset.df attribute:
    import numpy as np
    from pandas import DataFrame

    data = tablib.Dataset()
    data.df = DataFrame(np.random.randn(6, 4))
Import assumes (for now) that headers exist.
dict
A native Python representation of the Dataset object. If headers have been set, a list of Python dictionaries will be returned. If no headers have been set, a list of tuples (rows) will be returned instead.
A dataset object can also be imported by setting the Dataset.dict attribute:
    data = tablib.Dataset()
    data.dict = [{'age': 90, 'first_name': 'Kenneth', 'last_name': 'Reitz'}]
export(format, **kwargs)
Export Dataset object to format.
Parameters:
- **kwargs – (optional) custom configuration to the format export_set.
extend(rows, tags=[])
Adds a list of rows to the Dataset using Dataset.append.
filter(tag)
Returns a new instance of the Dataset, excluding any rows that do not contain the given tags.
headers
An optional list of strings to be used for header rows and attribute names.
This must be set manually. The given list length must equal Dataset.width.
html
An HTML table representation of the Dataset object. If headers have been set, they will be used as table headers.
Note
This method can be used for export only.
insert(index, row, tags=[])
Inserts a row to the Dataset at the given index.
Rows inserted must be the correct size (height or width).
The default behaviour is to insert the given row to the Dataset object at the given index.
insert_col(index, col=None, header=None)
Inserts a column to the Dataset at the given index.
Columns inserted must be the correct height.
You can also insert a column of a single callable object, which will add a new column with the return values of the callable each as an item in the column.
    data.append_col(col=random.randint)
If inserting a column, and Dataset.headers is set, the header attribute must be set, and will be considered the header for that column.
See Dynamic Columns for an in-depth example.
Changed in version 0.9.0: If inserting a column, and Dataset.headers is set, the header attribute must be set, and will be considered the header for that column.
json
A JSON representation of the Dataset object. If headers have been set, a JSON list of objects will be returned. If no headers have been set, a JSON list of lists (rows) will be returned instead.
A dataset object can also be imported by setting the Dataset.json attribute:
    data = tablib.Dataset()
    data.json = '[{"age": 90, "first_name": "John", "last_name": "Adams"}]'
Import assumes (for now) that headers exist.
latex
A LaTeX booktabs representation of the Dataset object. If a title has been set, it will be exported as the table caption.
Note
This method can be used for export only.
load(in_stream, format=None, **kwargs)
Import in_stream to the Dataset object using the format.
Parameters:
- **kwargs – (optional) custom configuration to the format import_set.
lpush(row, tags=[])
Adds a row to the top of the Dataset. See Dataset.insert for additional documentation.
lpush_col(col, header=None)
Adds a column to the start of the Dataset. See Dataset.insert_col for additional documentation.
ods
An OpenDocument Spreadsheet representation of the Dataset object, with Separators. Cannot be set.
Binary Warning
Dataset.ods contains binary data, so make sure to write in binary mode:
    with open('output.ods', 'wb') as f:
        f.write(data.ods)
remove_duplicates()
Removes all duplicate rows from the Dataset object while maintaining the original order.
rpush(row, tags=[])
Adds a row to the end of the Dataset. See Dataset.insert for additional documentation.
rpush_col(col, header=None)
Adds a column to the end of the Dataset. See Dataset.insert_col for additional documentation.
sort(col, reverse=False)
Sort a Dataset by a specific column, given string (for header) or integer (for column index). The order can be reversed by setting reverse to True.
Returns a new Dataset instance where the rows have been sorted by that column.
stack(other)
Stack two Dataset instances together by joining at the row level, and return a new combined Dataset instance.
stack_cols(other)
Stack two Dataset instances together by joining at the column level, and return a new combined Dataset instance. If either Dataset has headers set, then the other must as well.
subset(rows=None, cols=None)
Returns a new instance of the Dataset, including only specified rows and columns.
transpose()
Transpose a Dataset, turning rows into columns and vice versa, returning a new Dataset instance. The first row of the original instance becomes the new header row.
tsv
A TSV representation of the Dataset object. The top row will contain headers, if they have been set. Otherwise, the top row will contain the first row of the dataset.
A dataset object can also be imported by setting the Dataset.tsv attribute.
    data = tablib.Dataset()
    data.tsv = 'age\tfirst_name\tlast_name\n90\tJohn\tAdams'
Import assumes (for now) that headers exist.
xls
A Legacy Excel Spreadsheet representation of the Dataset object, with Separators. Cannot be set.
Note
XLS files are limited to a maximum of 65,000 rows. Use Dataset.xlsx to avoid this limitation.
Binary Warning
Dataset.xls contains binary data, so make sure to write in binary mode:
    with open('output.xls', 'wb') as f:
        f.write(data.xls)
xlsx
An Excel ‘07+ Spreadsheet representation of the Dataset object, with Separators. Cannot be set.
Binary Warning
Dataset.xlsx contains binary data, so make sure to write in binary mode:
    with open('output.xlsx', 'wb') as f:
        f.write(data.xlsx)
yaml
A YAML representation of the Dataset object. If headers have been set, a YAML list of objects will be returned. If no headers have been set, a YAML list of lists (rows) will be returned instead.
A dataset object can also be imported by setting the Dataset.yaml attribute:
    data = tablib.Dataset()
    data.yaml = '- {age: 90, first_name: John, last_name: Adams}'
Import assumes (for now) that headers exist.
Databook Object
class tablib.Databook(sets=None)
A book of Dataset objects.
export(format, **kwargs)
Export Databook object to format.
Parameters:
- **kwargs – (optional) custom configuration to the format export_book.
Functions
Exceptions
class tablib.InvalidDatasetType
You’re trying to add something that doesn’t quite look right.
class tablib.InvalidDimensions
You’re trying to add something that doesn’t quite fit right.
class tablib.UnsupportedFormat
You’re trying to add something that doesn’t quite taste right.
Now, go start some Tablib Development.