An important thing to remember about fancy type systems is that they can only ma...

shepherdjerred · on Sept 4, 2023

This is a fantastic point, and a huge issue that I had with TypeScript.

Libraries for TypeScript like Zod[0] allow you to define types and validate against them at runtime. For example: https://stackblitz.com/edit/typescript-pwzng4?file=index.ts

At work, we use this to validate API responses, local storage contents, or URL/router data. It solves the problem you described -- dirty data coming from outside of the type system.

This isn't a plug for Zod or TypeScript -- there are other similar libraries for TypeScript, and I would imagine other statically typed languages have something to fill this role.

[0]: https://zod.dev/

diarrhea · on Sept 4, 2023

Two random thoughts:

- why is there `safeParse`? Why is parsing not safe by default? Have an `unsafeParse` as an escape hatch, not the other way around.

- the imperative check for `success` seems unfortunate. Then again, I am familiar with pydantic from Python, which seems awfully similar to Zod, and there you'd get an Exception on failure which you'd have to catch. Some would call that uglier (I don't). Point being, I don't know of a better alternative, but I sure hope native liquid/dependent types wouldn't suffer from this anymore.

shepherdjerred · on Sept 4, 2023

You can use `parse` which will throw an exception if the data does not match the schema, or `safeParse` if you want to manually check.

> the imperative check for `success` seems unfortunate.

Yes! But the cool thing is that `success` being `true` narrows the type and ensures that the `data` field is present. You cannot access `data` without checking for success.

Here's another example showing those concepts: https://stackblitz.com/edit/typescript-wbuikz?file=index.ts
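
For readers who can't open the link, here is a minimal, dependency-free sketch of the narrowing described above. It does not use Zod: `ParseResult`, `User`, `safeParseUser`, and `describeUser` are hypothetical names that only imitate the shape of a `safeParse` result.

```typescript
// Hypothetical stand-in for a schema library's result type (not Zod's own definitions).
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

interface User {
  name: string;
  age: number;
}

// Hand-rolled parser: checks an unknown value at runtime and returns a tagged result.
function safeParseUser(input: unknown): ParseResult<User> {
  if (typeof input !== "object" || input === null) {
    return { success: false, error: "expected an object" };
  }
  const record = input as Record<string, unknown>;
  const name = record["name"];
  const age = record["age"];
  if (typeof name !== "string") {
    return { success: false, error: "name must be a string" };
  }
  if (typeof age !== "number") {
    return { success: false, error: "age must be a number" };
  }
  return { success: true, data: { name, age } };
}

function describeUser(input: unknown): string {
  const result = safeParseUser(input);
  // Reading result.data here would not compile: the failure variant has no `data` field.
  if (result.success === false) {
    return `invalid: ${result.error}`;
  }
  // Past the check, `result` is narrowed to the success variant, so `data` is a User.
  return `${result.data.name} is ${result.data.age}`;
}
```

The compiler forces the `success` check before `data` is reachable, which is the part that makes the imperative check less painful in practice.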

mullsork · on Sept 4, 2023

JS doesn't have any useful built-in way to deal with success/failure, other than exceptions. We use Boxed[1] for this:

  Result
    .fromExecution(() => schema.parse(input))
    .mapOk(parsed_input => /* */)
    .mapError(parse_errors => /* */)

which is an alright way to deal with things IMO.

I would hazard a guess that `parse` is a function introduced long ago, and `safeParse` came afterwards. (Safe as in "do not throw an exception")

[1]: https://github.com/swan-io/boxed
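
A dependency-free sketch of the same idea, for anyone who wants to see what "capture the exception as a value" amounts to. It assumes nothing about Boxed's internals: `Outcome`, `attempt`, `parsePort`, and `portMessage` are hypothetical names, and `parsePort` merely stands in for a throwing `schema.parse(input)`.

```typescript
// Hypothetical minimal result type; a real library's Result offers a much richer API.
type Outcome<T> =
  | { tag: "Ok"; value: T }
  | { tag: "Error"; error: unknown };

// Runs a function that may throw and captures the outcome as a plain value.
function attempt<T>(fn: () => T): Outcome<T> {
  try {
    return { tag: "Ok", value: fn() };
  } catch (error) {
    return { tag: "Error", error };
  }
}

// Throwing parser, standing in for a schema's `parse`.
function parsePort(input: unknown): number {
  if (
    typeof input !== "number" ||
    Math.floor(input) !== input ||
    input < 1 ||
    input > 65535
  ) {
    throw new Error(`not a valid port: ${String(input)}`);
  }
  return input;
}

function portMessage(input: unknown): string {
  const outcome = attempt(() => parsePort(input));
  switch (outcome.tag) {
    case "Ok":
      return `listening on ${outcome.value}`;
    case "Error":
      // The caught value is `unknown`, so check it before reading `.message`.
      return outcome.error instanceof Error ? outcome.error.message : "unknown error";
  }
}
```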

Cannabat · on Sept 4, 2023

`schema.parse()` is equivalent to pydantic `MyModel.parse_obj()` and throws if invalid.

`const result = schema.safeParse(val)` does not throw, and if `!result.success`, `result.error` contains a `ZodError` object with all the details. It's common to further transform this error object, perhaps to make it more human-readable or suitable for an API response.

Typically I find myself using `safeParse` far more than `parse`.
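
As a rough sketch of that "further transform" step: the code below assumes each validation issue carries a `path` into the input and a `message`, which is my understanding of how Zod reports issues. The `Issue` interface is a local stand-in rather than Zod's own type, and `pathToKey` and `formatIssues` are hypothetical helpers.

```typescript
// Local stand-in for a validation issue: where it happened and what went wrong.
interface Issue {
  path: Array<string | number>;
  message: string;
}

// Turns ["user", "emails", 0] into "user.emails[0]"; an empty path means the root value.
function pathToKey(path: Array<string | number>): string {
  let key = "";
  for (const part of path) {
    if (typeof part === "number") {
      key += `[${part}]`;
    } else {
      key += key === "" ? part : `.${part}`;
    }
  }
  return key === "" ? "(root)" : key;
}

// One human-readable line per issue, suitable for logging or an API error body.
function formatIssues(issues: Issue[]): string[] {
  return issues.map((issue) => `${pathToKey(issue.path)}: ${issue.message}`);
}
```

With a real schema library you would pass the issue list from the failed result to something like `formatIssues` before returning a 400 response.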

svrtknst · on Sept 4, 2023

TypeScript became so much more pleasant to use after we started using parsers regularly. We use Runtypes in some projects and a hand rolled lib I wrote in others, and being able to parse unknown data into a known shape is a life saver.

Glad to not have to write seemingly thousands of lines of interfaces and type guards any more.

PartiallyTyped · on Sept 4, 2023

For certain data you can exhaust the space of values and thoroughly test that your parser validates only the correct datatypes, and if you can do that, then you can be fearless in your impl with those "fancy type systems" because you guarantee that within the context of the impl, the software is sound. See [1], and the respective discussion here [2].

[1] https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...

[2] https://news.ycombinator.com/item?id=35053118
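
A small sketch of what "exhaust the space of values" can look like when the input space is tiny. The wire format here is made up for illustration (one byte, with 0 to 3 encoding a direction), and `Direction`, `parseDirection`, and `exhaustiveCheck` are hypothetical names.

```typescript
type Direction = "north" | "east" | "south" | "west";

// Hypothetical encoding: each direction maps to one byte value.
const CODES: Record<Direction, number> = { north: 0, east: 1, south: 2, west: 3 };

// Parser under test: returns a Direction for valid bytes, undefined otherwise.
function parseDirection(byte: number): Direction | undefined {
  switch (byte) {
    case 0: return "north";
    case 1: return "east";
    case 2: return "south";
    case 3: return "west";
    default: return undefined;
  }
}

// The input space is only 256 values, so try every one instead of sampling.
function exhaustiveCheck(): { accepted: number; roundTrips: boolean } {
  let accepted = 0;
  let roundTrips = true;
  for (let byte = 0; byte <= 255; byte++) {
    const parsed = parseDirection(byte);
    if (parsed === undefined) continue;
    accepted++;
    // Every accepted byte must encode back to itself.
    if (CODES[parsed] !== byte) roundTrips = false;
  }
  return { accepted, roundTrips };
}
```

If `accepted` equals the number of directions and every accepted byte round-trips, no byte outside the encoding can ever produce a `Direction`, so code downstream of the parser can rely on the type.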

IshKebab · on Sept 4, 2023

Most modern languages with static types have some way to preserve them across network/file transports. You can use a serialisation format that is typed (e.g. FlatBuffers etc.) or a library that validates the types on load, e.g. Serde. Even Python has Pydantic.

skybrian · on Sept 4, 2023

Type systems for serialization (protobufs and flatbuffers) are partially closed-world in the sense that we assume all past and future versions of the same schema come from the same place and we assume the developers followed certain rules when updating the schema. Because they allow for version mismatches, they’re designed rather differently than most native type systems for programming languages. There’s often an impedance mismatch with the native type system.

Validating on load is a form of runtime type checking. It’s not relying on a type assertion, which would allow the compiler to omit the type-check and the programmer to omit the error-handling when the check fails. Instead of preserving knowledge about the type constraints on some data, we prove the constraints again.
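
To make the distinction concrete, here is a minimal TypeScript sketch. `Config`, `loadUnchecked`, `loadChecked`, and `nextAttempt` are hypothetical names chosen for illustration.

```typescript
interface Config {
  retries: number;
}

// Type assertion: the compiler takes our word for it and nothing is checked at runtime.
function loadUnchecked(json: string): Config {
  return JSON.parse(json) as Config;
}

// Validation on load: the constraint is proven again for this particular value.
function loadChecked(json: string): Config {
  const value: unknown = JSON.parse(json);
  if (typeof value !== "object" || value === null) {
    throw new Error("expected an object");
  }
  const retries = (value as Record<string, unknown>)["retries"];
  if (typeof retries !== "number") {
    throw new Error("retries must be a number");
  }
  return { retries };
}

// Typed as number arithmetic, but only as trustworthy as the value it receives.
function nextAttempt(config: Config): number {
  return config.retries + 1;
}
```

Given the input `{"retries":"3"}`, `loadChecked` throws, while `loadUnchecked` returns an object whose `retries` is really a string, so `nextAttempt` quietly produces the string `"31"` despite its `number` return type.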

IshKebab · on Sept 4, 2023

> Validating on load is a form of runtime type checking. It’s not relying on a type assertion, which would allow the compiler to omit the type-check and the programmer to omit the error-handling when the check fails.

You can totally do that if you want and you fully trust that the thing that generated the data used the right types.

Similar to what you do every time you use a dynamic library.

skybrian · on Sept 4, 2023

Yes, you could, but not verifying input usually isn't advisable. It's making a closed-world assumption that isn't warranted in an environment where there are no guarantees. Networks are all about sharing data with other organizations. Filesystems contain files with uncertain origins.

A type system should be designed based on what invariants you can reasonably expect in the environment where the system operates, versus what you can expect will change. For example, Swift has an interesting type system that's specifically designed to make upgrading dynamic libraries easier and safer. Apple can make assumptions about what sort of changes there will and won't be to the libraries that ship with its operating systems.

Contrast with Go, which makes the assumption that there are no dynamic libraries; all Go source code is seen by the compiler.

IshKebab · on Sept 4, 2023

> Yes, you could, but not verifying input usually isn't advisable.

It's not not verifying input. It's just not verifying the types. There are no additional security issues unless you do something spectacularly stupid.

For example, you can load random bytes as a Protobuf file. It never validates that the data is in fact the right type; it just loads the bytes assuming they are. The only issue you'll see is junk data (or a parse error).

In any case I think you know that. I'm not really sure what point you were trying to make.

skybrian · on Sept 5, 2023

I'm not that clear about what point you're making either. We seem to be talking past each other?

The way I see it, types are sometimes used to represent security guarantees, so junk data getting cast to the wrong type could be a security hole. It depends on which types we're talking about.

For example, you can declare a SafeHTML type whose underlying representation is a string, and if you somehow deserialize it without actually doing the runtime check, it could be a security bug.

Verifying types and verifying input are closely related; the type keeps track of what you verified when you constructed an object of that type.
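
A sketch of the SafeHTML case in TypeScript, using a branded string. `SafeHtml`, `escapeHtml`, `render`, and `deserializeUnchecked` are hypothetical names, and the escaping shown is a simplified illustration rather than a complete sanitizer.

```typescript
// Branded string: the intent is that escapeHtml is the only way to obtain a SafeHtml.
type SafeHtml = string & { readonly __brand: "SafeHtml" };

// Constructor that does the runtime work the type is supposed to remember.
function escapeHtml(text: string): SafeHtml {
  const escaped = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
  return escaped as SafeHtml;
}

// Accepts only SafeHtml, so it can interpolate without escaping again.
function render(body: SafeHtml): string {
  return `<p>${body}</p>`;
}

// Deserializing with a bare assertion skips the check, so the type's promise is broken.
function deserializeUnchecked(json: string): SafeHtml {
  return JSON.parse(json) as SafeHtml;
}
```

`render(escapeHtml("<b>"))` yields escaped markup, whereas `render(deserializeUnchecked('"<script>alert(1)</script>"'))` passes the script tag straight through, even though both calls type-check.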