Layer 1: Model
Pure data structures with no dependencies. Pydantic models for serialization.
Dataset Models
ftm_lakehouse.model.DatasetModel
Bases: Dataset
Source code in ftm_lakehouse/model/dataset.py
public_url_prefix = None
class-attribute
instance-attribute
Public URL prefix for resources
storage = None
class-attribute
instance-attribute
Set storage for external lakehouse
ftm_lakehouse.model.CatalogModel
File Model
ftm_lakehouse.model.File
Bases: Stats
File metadata model. Arbitrary data can be stored in extra, including
ftm properties that should be added to the generated Entity.
Source code in ftm_lakehouse/model/file.py
blob_path
property
Relative path to blob in dataset archive
checksum
instance-attribute
SHA256 checksum (often referred to as content_hash)
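As an illustration of what such a checksum looks like, the following minimal sketch computes a SHA256 hex digest over a byte stream in chunks. The helper name `sha256_checksum` and the chunk size are assumptions made for this example; the library may compute the value differently.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_checksum(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hypothetical helper: hex-encoded SHA256 digest of a binary stream."""
    digest = hashlib.sha256()
    # Read in fixed-size chunks so large blobs are never fully held in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Usage: any binary file object works; an in-memory stream keeps this runnable.
checksum = sha256_checksum(io.BytesIO(b"hello"))
print(checksum)
```

The result is a 64-character hexadecimal string, which is the form a `checksum` value would take under these assumptions.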
dataset
instance-attribute
Dataset name
extra = {}
class-attribute
instance-attribute
Arbitrary extra data
id
property
The entity id is generated from a hash of the file path and the checksum. If the checksum is itself the key, only the checksum is used as the id.
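The id rule described above can be sketched roughly as follows. This is not the library's implementation: the function name `make_file_id`, the choice of SHA1, the `path:checksum` concatenation, and the reading of "the key" as the file's path are all assumptions made for illustration.

```python
import hashlib


def make_file_id(key: str, checksum: str) -> str:
    """Hypothetical sketch of deriving a stable id from path and checksum."""
    # Assumption: when the file is keyed by its checksum, reuse it directly.
    if key == checksum:
        return checksum
    # Assumption: otherwise hash path and checksum together, so the same
    # content stored under two different paths yields two different ids.
    return hashlib.sha1(f"{key}:{checksum}".encode("utf-8")).hexdigest()


content_hash = hashlib.sha256(b"example").hexdigest()
id_a = make_file_id("reports/2024/a.pdf", content_hash)
id_b = make_file_id("reports/2024/b.pdf", content_hash)
id_c = make_file_id(content_hash, content_hash)
```

Under these assumptions the id is deterministic for a given path and checksum, so regenerating the entity for an unchanged file produces the same id.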
meta_path
property
Relative path for this file's metadata JSON in dataset archive
origin = None
class-attribute
instance-attribute
Origin stage of this file
make_parents()
to_entity()
Make an entity for this File
Source code in ftm_lakehouse/model/file.py
Mapping Models
ftm_lakehouse.model.DatasetMapping
Bases: BaseModel
A complete mapping configuration for a dataset file.
Source code in ftm_lakehouse/model/mapping.py
Job Models
ftm_lakehouse.model.JobModel
Bases: BaseModel
Status model for a (probably long-running) job
Source code in ftm_lakehouse/model/job.py
ensure_run_id(value=None)
classmethod
ftm_lakehouse.model.DatasetJobModel
Bases: JobModel
Status model for a (probably long-running) job bound to a dataset
Source code in ftm_lakehouse/model/job.py
CRUD Models
ftm_lakehouse.model.Crud
Bases: BaseModel
Payload model for CRUD queue operations.
All lakehouse mutations go through this single queue, ordered by UUID7. The queue key (UUID7) is managed by anystore.Queue, not stored in the model.
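To show why UUID7 keys give the queue a natural order, here is a minimal sketch that builds UUIDv7-style values following the RFC 9562 bit layout (48-bit millisecond timestamp, version, random bits) and sorts payloads by key. The generator `uuid7_sketch`, the plain dict standing in for the queue, and the string payloads are hypothetical; this does not use or reproduce the anystore.Queue API.

```python
import os
import time
import uuid
from typing import Optional


def uuid7_sketch(timestamp_ms: Optional[int] = None) -> uuid.UUID:
    """Hypothetical UUIDv7-style generator (RFC 9562 layout)."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    rand_a = (rand >> 62) & 0xFFF                  # 12 bits
    rand_b = rand & ((1 << 62) - 1)                # 62 bits
    value = (timestamp_ms & ((1 << 48) - 1)) << 80  # timestamp in the top bits
    value |= 0x7 << 76                             # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                            # RFC variant
    value |= rand_b
    return uuid.UUID(int=value)


# A dict stands in for the queue: the key lives outside the payload.
base_ms = 1_700_000_000_000
queue = {}
for offset, payload in [(3, "delete-entity"), (1, "put-file"), (2, "put-entity")]:
    queue[uuid7_sketch(base_ms + offset)] = payload

# Sorting by key replays the operations in the order they were created.
ordered = [queue[key] for key in sorted(queue)]
```

Because the timestamp occupies the most significant bits, sorting the keys sorts the operations by creation time. Keys created within the same millisecond are ordered by their random bits in this sketch, so a real generator would need a counter or similar to keep them monotonic.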