KV Data Loading: Logging #56

thegreatfatzby · 2024-05-31T04:54:45Z

For production use, will the application do full/normal logging specifically for data loading? I would think this would be OK from a privacy perspective, as the entity loading the data is the ad tech, so they can't gain any new information from data load metrics. From an operational perspective, data loads will fail for odd reasons, and you'll want to get the "failed rows", failure reasons, etc.

peiwenhu · 2024-05-31T13:50:47Z

Yes. Logic that processes requests after the requests are decrypted requires certain protections.

Logic unrelated to processing requests is considered "safe", and its logs/metrics can be exported as-is.

Btw for data loading failures, we're interested to hear what you think the requirements are for handling row failures: other than skipping the row and logging/recording a metric, do you expect other error handling behaviors such as only committing a whole file or a group of rows in the file if all rows are successfully read?

thegreatfatzby · 2024-05-31T21:40:29Z

@peiwenhu interesting question indeed. I'd say there's no one right answer there for a generic tool. Allowing skipping and logging of bad rows will, I think, likely be important, and some configurability seems warranted.

Without logging it takes quite the Jedi to find bad rows. Given this data is onboarded by the ad tech, telling them which rows were rejected seems safe to me.

For stopping the entire file or some other type of atomicity, you can definitely see both cases (i.e. for some data sets, if a row is bad you want to move on and not stop the train; for others it's really important to get some changeset atomically). In theory, if you only support skip, clients can adjust to that with some costs.
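To make the two behaviors concrete, here is a rough sketch of what a configurable failure policy could look like. This is not the K/V server's actual loader: `FailurePolicy`, `LoadResult`, `parse_row`, `load_rows`, and the `key,value` row format are all hypothetical names and assumptions made up for illustration.

```python
import logging
from dataclasses import dataclass, field
from enum import Enum

logger = logging.getLogger("kv_data_loading_sketch")  # hypothetical logger name


class FailurePolicy(Enum):
    """Hypothetical config knob: what to do when a row fails to parse."""
    SKIP_ROW = "skip_row"        # commit the good rows, report the bad ones
    ATOMIC_FILE = "atomic_file"  # commit nothing if any row in the file is bad


@dataclass
class LoadResult:
    committed: dict = field(default_factory=dict)
    failed_rows: list = field(default_factory=list)  # (row_number, reason) pairs


def parse_row(line):
    """Parse one row. Assumes a made-up 'key,value' format."""
    key, sep, value = line.partition(",")
    if not sep or not key:
        raise ValueError("expected 'key,value'")
    return key, value


def load_rows(lines, policy):
    """Stage every parsable row, then commit according to the policy."""
    result = LoadResult()
    staged = {}
    for row_number, line in enumerate(lines, start=1):
        try:
            key, value = parse_row(line)
        except ValueError as err:
            # Record and log the rejected row and the reason, then move on.
            result.failed_rows.append((row_number, str(err)))
            logger.warning("row %d rejected: %s", row_number, err)
            continue
        staged[key] = value
    if policy is FailurePolicy.ATOMIC_FILE and result.failed_rows:
        return result  # all-or-nothing: leave committed empty
    result.committed = staged
    return result
```

With input rows `["a,1", "bad", "b,2"]`, `SKIP_ROW` would commit `a` and `b` and report row 2, while `ATOMIC_FILE` would commit nothing and still report row 2. Either way the failed rows and reasons are available to the party that loaded the data.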

Also going to ping some other experts here @swapnilpandit and @truemike and others once I find their handles.
