[Pharo-users] The opposite of encodeForHTTP

Norbert Hartl norbert at hartl.name
Sat Jul 21 07:41:00 EDT 2012


On 20.07.2012 at 20:53, Brenda Larcom wrote:

> Thanks, Norbert; I'll take a look at Zinc, see how my existing code might integrate, and propose something specific.  I personally think having an insecure option for things like URIs and HTTP that are inherently on borders almost all the time is unwise, but I'm happy to resolve my personal issues via documentation.  :)
> 
You said "almost" yourself :) I just wanted to say that different people have different ideas. Restricting software to what we can imagine keeps other people from building amazing things we could not imagine. 

> One reason validating parsers are so powerful is that when layers stack as you mentioned, the security starts working as soon as the functional part does.  I agree, such a parser definitely belongs at the borders of interpretation schemes, not inside them.  Inside them it'll just use up time without providing value.  Conveniently, the tool people naturally reach for at interpretation borders usually has a parser in it someplace.
> 
> And yes, there do seem to be a particular lot of fiddly bits in URIs.  So fiddly a few of the examples in the RFCs (as usual) don't match the rest of the spec.
> 
Agreed. I'm eager to see what you'll come up with. 

Norbert
> 
> On Jul 20, 2012, at 10:50 AM, Norbert Hartl <norbert at hartl.name> wrote:
> 
>> Brenda,
>> 
>> These are all good points from, as you said, a "security architecture perspective", and we should improve on that. The Zinc HTTP components already do a good job of structuring the entities as they should be. I think security add-ons can hook onto what is already there. There is a huge number of things to consider. Even within a single URL, the different components have different encoding needs. 
>> On the other hand, security is not a major goal in a lot of use cases I can imagine. There is (at least for me) a triangle of security, performance, and usability that makes it hard to have a single approach that fits them all. And we Smalltalkers tend to value freedom very highly when it comes to programming. In other words, I would say we like to preserve the freedom to design an insecure application at will :) The best way to solve those issues is to be modular, meaning a layer that can be put on top of the existing stuff to fulfill a particular use case.
>> The things you describe are present in a lot of environments. I mostly call this an "at the border of a system" problem. Things like strings inside an environment are harmless. Problems appear when you cross system borders, meaning you cross interpretation schemes. And this is a topic broader than just HTTP. 
>> If we look at a widely known problem like SQL injection, there is not only the need for proper entity handling but also for stacking validators and converters for different problems. It is such a big thing because you have a URL that goes through middleware and ends up in a storage system like an SQL database. Here you cross at least two borders: HTTP to middleware and middleware to database. So you need to stack up converters and validators for HTTP, probably shell escapes in the middleware, and finally for SQL. I think a security approach is doable if you can assemble those things according to the layers you use. And for the same reason it goes so terribly wrong everywhere. 
>> So what does this modular thing mean? Having many possibilities to fulfill certain needs without restricting everyone to a single scheme. 
>> My advice would be to have a look at the Zinc components and propose improvements from your perspective. Then publish your results here, and there will be a lot of clever people finding a good way to integrate them in a modular way.
>> 
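The two borders in the SQL injection example above (HTTP to middleware, middleware to database) can be sketched roughly as follows. This is a Python illustration rather than Pharo code, and the `lookup` function, the `users` table, and the `name` parameter are all hypothetical names made up for the example. Each border uses the converter that belongs to it: the URL parser decodes the query string, and the database driver binds the value instead of splicing it into SQL text.

```python
import sqlite3
from urllib.parse import urlsplit, parse_qs


def lookup(url, db):
    """Find user ids by the 'name' query parameter of a URL (hypothetical example)."""
    # Border 1, HTTP -> middleware: let the URL parser split and decode.
    # strict_parsing=True makes malformed query fields raise ValueError.
    params = parse_qs(urlsplit(url).query, strict_parsing=True)
    (name,) = params["name"]  # require exactly one value for 'name'

    # Border 2, middleware -> SQL: bind the decoded value as a parameter,
    # so it is never interpreted as SQL syntax.
    return db.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()


# Hypothetical in-memory database used only for this sketch.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")
```

With this setup, a URL ending in `?name=alice` finds the row, while an injection attempt such as `?name=x%27%20OR%20%271%27%3D%271` is decoded at the first border and then treated as an ordinary (non-matching) string at the second.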
>> I hope this helps,
>> 
>> Norbert
>> 
>> On 20.07.2012 at 18:25, Brenda Larcom wrote:
>> 
>>> I suppose I could unlurk at this point.  :)
>>> 
>>> I'm a security geek (specifically, a secure development geek focusing on security architecture) in my day job, and I have a long unmaintained architecture security analysis tool written in Squeak (http://www.octotrike.org/ for the curious), which I have been unmothballing.  We are considering switching to Pharo, partly because we are planning to add some P2P collaboration features we think have an HTTP layer in there somewhere & partly because we like it small, tidy, and self-compatible.  Hence my lurking.
>>> 
>>> I've done some work on how data validation should be done for security purposes, for my day job.  This includes output encoding and decoding, like what Davide is talking about.  It's pretty tricky to get right because of the large number of contexts, with subtly different rules.  E.g. I would expect encodeForHTTP to be appropriate for HTTP headers, except that two things you usually want to put in HTTP headers are URIs and cookies, each of which has different rules (for different subparts, even) for what should be encoded.  The differences don't seem like much, but in the wild, my coworkers & I see these sorts of differences lead to vulnerabilities on a daily basis.
>>> 
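To make the "subtly different rules" point concrete, here is a small sketch using Python's standard `urllib.parse` (not the Pharo `encodeForHTTP` method, whose exact behaviour is not shown in this thread). The sample string and variable names are made up for illustration. The same input needs three different encodings depending on which part of a URI it is headed for.

```python
from urllib.parse import quote, quote_plus

raw = "a/b c&d=e"  # hypothetical untrusted input

# As a single path segment: '/' is content here, so it must be escaped.
as_path_segment = quote(raw, safe="")

# As a whole path: '/' is structure and stays as-is.
as_whole_path = quote(raw, safe="/")

# As a form-style query value: space becomes '+', and '&' and '=' are escaped.
as_query_value = quote_plus(raw)

print(as_path_segment)  # a%2Fb%20c%26d%3De
print(as_whole_path)    # a/b%20c%26d%3De
print(as_query_value)   # a%2Fb+c%26d%3De
```

A single string-level "encode for HTTP" call has to pick one of these, which is where the mismatches come from.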
>>> From a security architecture perspective, the absolute best way to handle encoding & decoding for a structured object like an HTTP request or response (or a URI, or a cookie, or an HTML document, or..) is to use a validating parser.  Basically, when you get an HTTP request, parse it & put it in an object structured like the request.  At that time, you know the meaning of each portion of the string you are parsing, so you can interpret the bits correctly/safely.  The object(s) should store the individual strings that are actually content (vs. structure & constants) in a decoded state.  The developer should get everything from the objects, in decoded form, and put everything into the objects in decoded form.  Then, when it is time to send the response, the objects encode everything safely/canonically based on the exact type of objects they are.  This design concentrates the hard stuff (encoding, decoding, canonicalization, layering encodings on top of each other) near the interfaces, at the first/last possible moment enough context is known to interpret the information accurately.  It separates the mechanics of using a protocol or format from the intent of using the protocol.  It lets someone like me easily QA both the library and application code for security.  It is also simple for the developer to use safely (all the dev needs to think about is what objects/content they want to assemble, and the data validation at that layer is taken care of automatically) & is therefore the only design pattern I have seen consistently avoid all encoding-related vulnerabilities in the wild.  
>>> 
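A minimal sketch of the validating-parser pattern described above, in Python for illustration only. The `QueryParameter` class, its method names, and the deliberately strict grammar are assumptions made up for this example; they are not taken from Zinc or from the URI library mentioned in this thread. The object validates and decodes once on the way in, holds only decoded content, and encodes once on the way out.

```python
import re
from urllib.parse import quote, unquote

# Hypothetical strict grammar: unreserved characters or %XX escapes only.
_ENCODED = re.compile(r"(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2})*")


class QueryParameter:
    """One name=value pair of a query string, stored in decoded form."""

    def __init__(self, name, value):
        self.name = name    # decoded content
        self.value = value  # decoded content

    @classmethod
    def from_wire(cls, text):
        # Split on the first '=' (structure), then validate each part
        # against the grammar before decoding it.
        name, sep, value = text.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {text!r}")
        for part in (name, value):
            if not _ENCODED.fullmatch(part):
                raise ValueError(f"malformed component: {part!r}")
        return cls(unquote(name), unquote(value))

    def to_wire(self):
        # Encode at the last possible moment, when the context is known.
        return quote(self.name, safe="") + "=" + quote(self.value, safe="")
```

Application code only ever touches `name` and `value` in decoded form, for example `QueryParameter.from_wire("q=a%26b%3Dc").value` is the plain string `a&b=c`, and input such as `q=%zz` is rejected at the border instead of being passed through.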
>>> So what does this mean?  Basically, from a security perspective, encoding & decoding methods should live in the objects they encode and decode, and never be called from outside code.  That is, there should be an HTTPHeader>>fromString: or fromStream: method, which is called from an HTTPResponse>>fromString: or fromStream: method, and no String>>decodeFromHTTP.  Adding a String>>decodeFromHTTP method is easy from the library maintainer's point of view, approximately correct (way more correct than no method at all), and it matches what most languages are doing these days, but it shifts the burden of all that thought about the specific HTTP header & context to the application developer, who is usually just trying to write an application, not learn every single detail of the HTTP & gazillion other standards he would need to do this safely.
>>> 
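The delegation described above (a response parser calling a header parser, with no string-level decode method) might look roughly like this. It is a Python sketch, not Smalltalk; the `HttpHeader` and `HttpResponseHead` classes loosely mirror the hypothetical HTTPHeader and HTTPResponse names in the paragraph above, and the header-name rule is a simplified assumption rather than the full HTTP grammar.

```python
import re


class HttpHeader:
    """One header field; name and value are validated on construction."""

    def __init__(self, name, value):
        # Simplified assumption: names are letters, digits, and hyphens only.
        if not re.fullmatch(r"[A-Za-z0-9-]+", name):
            raise ValueError(f"invalid header name: {name!r}")
        # CR or LF inside a value would let content masquerade as structure.
        if "\r" in value or "\n" in value:
            raise ValueError("line break in header value")
        self.name = name
        self.value = value

    @classmethod
    def from_line(cls, line):
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"missing ':' in header line: {line!r}")
        return cls(name, value.strip(" \t"))

    def to_line(self):
        return f"{self.name}: {self.value}"


class HttpResponseHead:
    """Status line plus headers; header parsing is delegated to HttpHeader."""

    def __init__(self, status, headers):
        self.status = status
        self.headers = headers

    @classmethod
    def from_string(cls, text):
        status, *lines = text.split("\r\n")
        return cls(status, [HttpHeader.from_line(line) for line in lines if line])
```

Because the checks live in the header object, an application that builds `HttpHeader("Location", user_input)` gets a `ValueError` for a value containing a line break, without the application author having to know about header injection.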
>>> Since this is a suggestion for substantial architecture change that would cause significant backwards compatibility issues throughout the entire Web application stack, and I'm new to Pharo to boot, I am expecting some interesting discussion to occur next.  Or maybe profound silence.  :)
>>> 
>>> In my back pocket somewhere amongst the code I am unmothballing, I have 95% of a thoroughly documented URI implementation and test suite that follows this pattern and is pedantically compliant with one or another of the URI RFCs (it's old, may not be the most recent).  I believe Spoon & Slate are using a previous version of it or its derivatives.  I'll need a fully pedantic HTTP parsing stack to feel comfortable releasing a P2P architecture security analysis tool (high value target, large attack surface, potentially very large professional embarrassment), so whatever isn't available, I expect we'll end up writing.  If Pharo folks are interested in this pattern, I would love to contribute my libraries/changes as I finish them, get advice on backward compatibility, performance, and APIs people would like to see, review whatever related code you'd like for security issues, and/or collaborate with any other developer who is interested.
>>> 
>>> Brenda
>>> 
>>> 
>>> On Jul 20, 2012, at 1:47 AM, Davide Varvello <varvello at yahoo.com> wrote:
>>> 
>>>> Good, Stef. I opened a new feature request as a reminder here: http://code.google.com/p/pharo/issues/detail?id=6430
>>>>  
>>>> Davide
>>>> 
>>>> From: Stéphane Ducasse [via Smalltalk] <[hidden email]>
>>>> To: Davide Varvello <[hidden email]> 
>>>> Sent: Thursday, July 19, 2012 10:43 PM
>>>> Subject: Re: The opposite of encodeForHTTP
>>>> 
>>>> Let us fix it and propose a decodeFromHTTP method 
>>>> 
>>>> Stef 
>>>> 
>>>> On Jul 18, 2012, at 2:02 PM, Davide Varvello wrote: 
>>>> 
>>>> > Thanks Sven, 
>>>> > I was looking for String>>decode..whatever... with no luck :-) 
>>>> > Cheers 
>>>> > 
>>>> 
>> 
