Model field reference

This document is the API reference for the BaseField base class, including the field options and field types that data_migrator offers.

Note

Technically, these fields are defined in data_migrator.models.fields, but for convenience they are imported into data_migrator.models. The standard convention is to use from data_migrator import models and refer to fields as models.<Foo>Field.

Field options

class data_migrator.models.fields.BaseField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

Base column definition for the transformation DSL

The following arguments are available to all field types. All are optional.

Parameters:
  • pos (int) – If positive or zero, this denotes the column in the source data to select and store in this field. If not set (or negative), the field does not select a single column from the source; instead the full row is passed to the parse function.
  • name (str) – The name of this field. By default this is the name provided in the model declaration. This attribute is to replace that name by the final column name.
  • default – The default value to use if the source column is found to be a null field or if the parse function returns None. Fields that are not Null<xxx>Fields come with a built-in default. For example, NullStringField has both NULL and the empty string as empty values, whereas StringField only has the empty string as its empty value. Use this attribute to change that to some other standard value. Consider a Country field stored as a string, with the home country set as the default.
  • key (boolean) – If set, this indicates the field is a key field for identification of the object.
  • nullable (str) – If set, a source column value that matches this string is considered a None value. By default this attribute is set to 'NULL' (see the signature above). Note that for non-Null fields, None will be translated to default.
  • replacement – If set, this is a pre-emit replacement function. This could be used to insert dynamic replacement lookup select queries, adding more indirection into the data generation. The value can be either a function or a string.
  • required (boolean) – If set, this indicates the field is required to be set.
  • parse – If set, this is the parsing function that turns the value read from the source into something to use further down the data migration. Use this, for example, to clean phone numbers, translate country definitions into alpha3 codes, or translate IDs into values based on a separately loaded lookup table.
  • validate – Expects a function that returns a boolean; it is used to validate the input data. If the data is expected to fall within a range or to follow a specific format, add a column validator here. Raises ValidationException if set and the function returns False.
  • max_length (int) – In case of StringField, use this to trim string values to a maximum length.
  • unique (boolean) –

    If True, data-migrator will check uniqueness of intermediate values (after parsing). Default is False.

    In relationship with the default manager this will keep track of values for this field. The manager can raise exceptions if uniqueness is violated. Note that it is up to the manager to either fail or drop the record if the exception is raised.

  • anonymize – An additional function that will be called at emit to anonymize the data.
  • validate_output – A pre-emit validator used to scan the bare output and raise exceptions if output is not as expected.
  • creation_order – An automatically generated attribute used to determine the order of specification; it is used when emitting the dataset.
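
The options above can be illustrated with plain Python functions. The following is a minimal sketch, not the library's implementation: clean_phone and is_valid_phone are hypothetical functions of the kind that could be passed as parse and validate, and read_column is a hypothetical stand-in that mimics the documented interplay of nullable, parse and default (it assumes parse is skipped for null values, which the reference does not state).

```python
import re


def clean_phone(value):
    """Hypothetical `parse` candidate: keep digits and an optional leading plus."""
    value = value.strip()
    prefix = "+" if value.startswith("+") else ""
    return prefix + re.sub(r"\D", "", value)


def is_valid_phone(value):
    """Hypothetical `validate` candidate: must return a boolean."""
    return bool(re.fullmatch(r"\+?\d{7,15}", value))


def read_column(raw, nullable="NULL", default="", parse=None):
    """Illustration only: null match first, then parse, then fall back to default."""
    value = None if raw == nullable else raw
    if value is not None and parse is not None:
        value = parse(value)
    return default if value is None else value


phone = clean_phone(" +31 (0)6-1234 5678 ")
country = read_column("NULL", default="NL")
```

In a model declaration these functions would presumably be wired in as models.StringField(pos=0, parse=clean_phone, validate=is_valid_phone), using the parameters listed above.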
emit(v, escaper=None)[source]

helper function to export this field.

Expects a value from the model to be emitted

Parameters:
  • v – value to emit
  • escaper – escaper function to apply on value
Returns:

emitted value.

Raises:

ValidationException – raised if explicit validation fails.

json_schema(name=None)[source]

generate json_schema representation of this field

Parameters:name – name if not taken from this field
Returns:python representation of json schema
scan(row)[source]

Scan a row and harvest the distinct value.

Takes a row of data and parses the required fields out of it.

Parameters:row (list) – array of source data
Returns:parsed and processed value.
Raises:ValidationException – raised if explicit validation fails.

Note

Use this with HiddenField and a row parse function if some combination of fields (aka a compound key) is expected to be unique and not to be violated.
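
As a rough sketch of the note above: a row-level parse function can fold several columns into one compound key, and uniqueness is then checked on that key. The function compound_key, the sample rows and the manual duplicate check are all hypothetical; in the library the check is done by the manager, not by user code.

```python
def compound_key(row):
    """Hypothetical row parse function: combine columns 0 and 2 into one key."""
    return "{}|{}".format(row[0].strip().lower(), row[2].strip())


rows = [
    ["alice@example.com", "Alice", "2020"],
    ["ALICE@example.com ", "Alice B.", "2020"],
    ["bob@example.com", "Bob", "2020"],
]

# Stand-in for the uniqueness bookkeeping that the manager performs.
seen = set()
duplicates = []
for row in rows:
    key = compound_key(row)
    if key in seen:
        duplicates.append(key)
    seen.add(key)
```

With the documented parameters, this would presumably be declared as models.HiddenField(parse=compound_key, unique=True), leaving pos unset so that the full row reaches the parse function.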

Field types

class data_migrator.models.fields.ArrayField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

JSON array field

class data_migrator.models.fields.BooleanField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

Boolean field handler.

A bool that takes any cased permutation of true, yes, or 1 and translates it into True; any other value becomes False.
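
The translation rule can be sketched in a few lines of plain Python. This is an illustration of the described behaviour, not the library's code; the name to_bool and the stripping of surrounding whitespace are assumptions.

```python
TRUTHY = {"true", "yes", "1"}


def to_bool(value):
    """Case-insensitive match against the truthy set; everything else is False."""
    return str(value).strip().lower() in TRUTHY
```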

class data_migrator.models.fields.DateTimeField(f=None, **kwargs)[source]

Basic datetime field handler

__init__(f=None, **kwargs)[source]
Parameters:f – format of the datetime. Default is %Y-%m-%dT%H:%M:%SZ (RFC 3339)
emit(v, escaper=None)[source]

helper function to export this field.

Expects a value from the model to be emitted

Parameters:
  • v – value to emit
  • escaper – escaper function to apply on value
Returns:

emitted value.

Raises:

ValidationException – raised if explicit validation fails.
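
To show what the format argument f means in practice, here is a small sketch using only the standard library. It assumes that f is a strptime/strftime-style format string, as the default value suggests; parse_timestamp is a hypothetical helper, not part of data_migrator.

```python
from datetime import datetime

DEFAULT_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # the default documented for DateTimeField


def parse_timestamp(value, f=DEFAULT_FORMAT):
    """Parse a source string with the given format (returns a naive datetime)."""
    return datetime.strptime(value, f)


parsed = parse_timestamp("2017-03-01T12:30:00Z")
formatted = parsed.strftime(DEFAULT_FORMAT)

# A non-default source format, as one would pass through f.
legacy = parse_timestamp("01/03/2017 12:30", f="%d/%m/%Y %H:%M")
```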

data_migrator.models.fields.DictField

alias of data_migrator.models.fields.ObjectField

class data_migrator.models.fields.HiddenField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

Non emitting Field for validation and checking.

A field that accepts but does not emit. It is useful for uniqueness checks and more. Combine this with a row parse function to check the complete row.

class data_migrator.models.fields.IntField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

Basic integer field handler

data_migrator.models.fields.IntegerField

alias of data_migrator.models.fields.IntField

class data_migrator.models.fields.JSONField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

A field that takes the values and emits a JSON-encoded string. Great for maps and lists that are stored in a string-like database field.

emit(v, escaper=None)[source]

Emit is overridden to add the to_json option.
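
The effect of a JSON-encoding emit can be approximated with the standard json module. This is a sketch of the idea only; to_json_string is a hypothetical helper and sort_keys is an assumption made here to get a stable output.

```python
import json


def to_json_string(value):
    """Encode a Python map or list as a JSON string suitable for a text column."""
    return json.dumps(value, sort_keys=True)


encoded = to_json_string({"tags": ["a", "b"], "active": True})
```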

data_migrator.models.fields.ListField

alias of data_migrator.models.fields.ArrayField

class data_migrator.models.fields.MappingField(data_map, as_json=False, strict=False, **kwargs)[source]

Map based field translator.

A field that takes values and translates them according to a map. Great for identity column replacements. If needed, the output can be emitted as JSON, for example if the map returns lists.

__init__(data_map, as_json=False, strict=False, **kwargs)[source]
Parameters:
  • data_map – The map used to translate values. Note that the field returns default if it is not able to map the key.
  • as_json – If True, the field will be output as JSON encoded. Default is False.
  • strict – If True, the value must be found in the map. Default is False.
emit(v, escaper=None)[source]

Emit is overridden to add the to_json option.
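
The lookup semantics described for data_map and strict can be sketched as follows. The function translate and the country map are hypothetical, and raising KeyError in strict mode is an assumption; the reference does not say which exception the library raises.

```python
COUNTRY_MAP = {"NL": "NLD", "BE": "BEL"}  # hypothetical alpha2 -> alpha3 map


def translate(value, data_map, default=None, strict=False):
    """Return the mapped value; fall back to default unless strict is set."""
    if value in data_map:
        return data_map[value]
    if strict:
        # Assumed error type, chosen for illustration only.
        raise KeyError("value {!r} not found in data_map".format(value))
    return default


mapped = translate("NL", COUNTRY_MAP)
fallback = translate("XX", COUNTRY_MAP, default="UNK")
```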

class data_migrator.models.fields.ModelField(fields, strict=None, **kwargs)[source]

Model relation for hierarchical structures.

a field that takes another model to build hierarchical structures.

__init__(fields, strict=None, **kwargs)[source]
Parameters:
  • fields – relationship to another model.
  • strict (boolean) – model is considered strict.
emit(v, escaper=None)[source]

Emit is overridden to add the to_json option.

json_schema(name=None)[source]

generate json_schema representation of this field

Parameters:name – name if not taken from this field
Returns:python representation of json schema
class data_migrator.models.fields.NullField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

Field that always returns NULL by generating None.

json_schema(name=None)[source]

generate json_schema representation of this field

class data_migrator.models.fields.NullIntField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

Null integer field handler.

A field that expects the column to be an integer but also accepts None, which is not the same as 0 (zero).
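
The difference between a null-aware and a plain integer field can be shown with two small functions. Both are hypothetical illustrations: the null marker 'NULL' comes from the signature above, while the default of 0 for the plain variant is an assumption.

```python
def to_int(raw, nullable="NULL", default=0):
    """Plain integer behaviour: the null marker collapses to the default."""
    return default if raw == nullable else int(raw)


def to_null_int(raw, nullable="NULL"):
    """Null-aware behaviour: the null marker is kept as None."""
    return None if raw == nullable else int(raw)
```

With these, a source value of NULL and a source value of 0 stay distinguishable only in the null-aware variant.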

class data_migrator.models.fields.NullStringField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

Null String field handler.

A field that expects the column to be a string but also accepts None, which is not the same as the empty string (“”).

class data_migrator.models.fields.ObjectField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

JSON object field

class data_migrator.models.fields.StringField(pos=-1, name='', default=None, nullable='NULL', key=False, required=False, replacement=None, parse=None, validate=None, anonymize=None, max_length=None, unique=False, validate_output=None)[source]

String field handler. A field that expects the column to be a string.

class data_migrator.models.fields.UTCNowField(f=None, **kwargs)[source]

UTCNow generating field.

A field that generates the current UTC timestamp.
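
A generated UTC timestamp can be sketched with the standard library. This assumes that f is a strftime-style format with the same default as DateTimeField, which the signature suggests but the reference does not state; utc_now is a hypothetical helper.

```python
from datetime import datetime, timezone

DEFAULT_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed to match DateTimeField's default


def utc_now(f=DEFAULT_FORMAT):
    """Format the current UTC time with the given format string."""
    return datetime.now(timezone.utc).strftime(f)


stamp = utc_now()
```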

class data_migrator.models.fields.UUIDField(*args, **kwargs)[source]

UUID generating field.

A field that generates a value using str(uuid.uuid4()).
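
For reference, the expression named above produces a random version 4 UUID in its canonical 36-character text form. The helper name new_id is hypothetical.

```python
import uuid


def new_id():
    """Return a random UUID (version 4) as a string."""
    return str(uuid.uuid4())


first = new_id()
second = new_id()
```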