Here you can find discussions and work from the first Camp Smalltalk 2000 in San Diego, where Swazoo was born.
San Diego 1st Camp Smalltalk 2000
Why shouldn't we join forces and make one really good web application server in Smalltalk? It can behave as a standalone web server or work together with a server such as Apache. It should be portable to all Smalltalk dialects and all platforms. It should join the Web and Smalltalk philosophies in the best way possible. And it should be Open Source, the product of group work by many interested parties.
Organizer: David Farber
Swazoo team from left: Joseph Bacanskas, Ken Treis, Benny Sadeh, Janko Mivšek, David Farber, picture by Bob Hartwig
Ideas and discussion
10/17/2002 -- If you want to build something really great - simply clone the Objective-C version of WebObjects. I see a lot of not-invented-here syndrome in the Smalltalk community. There are, what, 20-some toy web servers written in Smalltalk? I just finished reading the web dev docs for VW7 and was extremely disappointed to see that they've mostly just cloned the lame Java APIs. TagLibs, embedded code pages, and the rest of the dreck. BFD. Embedding code in pages is a maintenance nightmare - especially wrt localization. TagLibs are only a tiny bit better. XML is JUNK quite frankly (now I know where the C++ standards committee people ended up) and I don't really care if XSL processing is available or not.
It seems that the web world has split into two factions - page people and object people. The page people believe that all can be accomplished by submitting a page to a server and having transformations performed on the page to produce a new page. I think this is unsupportable as an application architecture - despite its widespread popularity. The other view is that the browser is a GUI client and you keep a shadow copy of a conventional GUI on the server that happens to render by emitting HTML, is built up and torn down on every request, and every request is essentially a single event delivery to the page. This is In My Not Even A Little Humble Opinion the best way to do applications but it requires long term architectural investment and only WebObjects and its clone GnustepWeb have made the investment (and apparently VisualWave but it has this graphics console requirement that is just heinous and impractical).
The only reason I'm posting here BTW is I am desperate to find a viable alternative to WebObjects since Apple hosed it by porting it to Java and completely destabilized the product while making it much less flexible. I was hoping the Smalltalk community would have a similar architecture available and while VisualWave looks closest - the GUI requirement is a deal killer.
Rant over. Thanks for your time. Todd Blanchard
2/29 -- i'm pretty busy right now, but i will try to spend some time this weekend to organize this project and start to build a consensus on where we focus our efforts during Camp Smalltalk week. David Farber
For a start, I just released our own Web Application Server named AIDA/Web to open source. You can download and try it here: www.eranova.si/aida/index.htm. AIDA/Web is the product of four years of development and use in Internet/Intranet production systems. It can be connected to a Gemstone or Versant object database. Please read more in the FAQ.
Recently I've been researching existing Apache modules that connect various systems and back end application engines to the Apache web server. I've come across FastCGI and I think it's the best and quickest way for the Smalltalk community to get Smalltalk and Apache (and other web servers) connected together. It also should result in a highly portable implementation since the source could ALL be written in Smalltalk code!
I think that the FastCGI protocol is simple enough that it could be implemented at the next Camp Smalltalk (or even before if the community gets ambitious).
What is FastCGI?
"FastCGI is a language independent, scalable, open extension to CGI that provides high performance and persistence without the limitations of server specific APIs. ... FastCGI applications use TCP sockets to communicate with the web server. This scalable architecture allows applications to run on the same platform as the web server or on many machines scattered across an enterprise network. ... FastCGI applications are fast because they're persistent. There is no per-request startup and initialization overhead. This makes possible the development of applications which would otherwise be impractical within the CGI paradigm (e.g. a huge Perl script, or an application which requires a connection to one or more databases)."
Check out the links at: FastCGI.
All the best,
Peter William Lount, Smalltalk.org, email@example.com
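To illustrate the claim above that the FastCGI protocol is simple, here is a minimal sketch of its record framing in Python (a Smalltalk port would be a few short methods). The 8-byte header layout and the record type numbers come from the FastCGI 1.0 specification; the function names encode_record and decode_record are made up for this example, and padding to an 8-byte boundary is a recommendation in the spec, not a requirement. This is only the framing step, not a full responder.

```python
import struct

# Header fields per the FastCGI 1.0 spec: version, type,
# requestId (16 bit), contentLength (16 bit), paddingLength, reserved.
FCGI_VERSION_1 = 1
FCGI_BEGIN_REQUEST = 1
FCGI_END_REQUEST = 3
FCGI_PARAMS = 4
FCGI_STDIN = 5
FCGI_STDOUT = 6
HEADER = struct.Struct(">BBHHBB")  # 8 bytes, big-endian


def encode_record(record_type, request_id, content=b""):
    """Frame `content` as one record, padded to an 8-byte boundary."""
    if len(content) > 0xFFFF:
        raise ValueError("content must be split across several records")
    padding = -len(content) % 8
    header = HEADER.pack(FCGI_VERSION_1, record_type, request_id,
                         len(content), padding, 0)
    return header + content + b"\x00" * padding


def decode_record(data):
    """Parse one record from the front of `data`.

    Returns (type, request_id, content, remaining_bytes).
    """
    version, record_type, request_id, length, padding, _ = HEADER.unpack_from(data)
    if version != FCGI_VERSION_1:
        raise ValueError("unsupported FastCGI version")
    start = HEADER.size
    content = data[start:start + length]
    rest = data[start + length + padding:]
    return record_type, request_id, content, rest
```

A responder would read FCGI_BEGIN_REQUEST, FCGI_PARAMS and FCGI_STDIN records from the socket, then answer with FCGI_STDOUT records followed by FCGI_END_REQUEST.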
Other systems to consider are:
- Comanche (for Squeak)
- VisualWave (for VisualWorks)
- VisualWorks web server (and WikiWorks)
- The proposed project for Interfacing Apache to Smalltalk fits in with this as well.
- The Bytesmiths Toolkit contains a compact, high-performance web server, and is also in the public domain.
- FastCGI is an open source, high-speed, scalable interface that is independent of OS, web server and language. Perfect for linking Smalltalk and web servers of all kinds.
We may well end up working with the Internet client/server framework folks, too. After talking to Janko, I think I understand where his project is headed -- and part of it follows the same pattern as Hydrogen. The internet framework folks seem to be leaning towards support of services, while this project looks to build web applications. Count me in. :) -- Ken Treis
Comments on an Ideal Web Application Server
You might also consider the HTTP Server built in to Server Smalltalk in VisualAge. It uses an interesting servlet capability to register services. It appears to be modeled after the "J word" idea. -- Jeff Odell
It seems that the core issue for a project like this is:
Can Smalltalk play a strategic role in web applications, especially as a better business logic solution than the general scripting languages (Tcl, Python, etc.)?
Currently, I'm struggling with selecting the best technologies to bring our hospital applications to the web. Reading through the references Alan Knight provided on the Interfacing Apache to Smalltalk page (thanks Alan - very helpful), I see that a successful strategy for many web sites is directories full of scripts, or HTML pages with embedded scripts, that form a web application. To me, it is a given that, with Smalltalk and its tools, I can develop more complex solutions with obvious advantages over scripting languages. The question, then, is: can I implement and scale this solution?
For example, I already have a servlet, using IBM Smalltalk SST, that merges HTML templates with special tags that invoke Smalltalk methods to generate specific custom HTML (poor man's "Smalltalk Server Pages"?). I was using this in the client of our client/server app to display information in an IE ActiveX component. With minimal effort, this was up and running on a server. However, I'm struggling to address basic problems like:
- How do I update a servlet's implementation? Do we need a scripting solution? Use something like ICs (the IBM Smalltalk term for loadable/unloadable sets of classes)?
- How do I update the basic server code (image) in a 24x7 implementation?
- How do I integrate with static pages? Static pages are best served by a standard web server. So even though my Smalltalk image can do HTTP, it seems to not be the best/fastest solution. The Interfacing Apache to Smalltalk project might solve that nicely.
- For "business logic driven" pages, the Interfacing Apache to Smalltalk project is debating between a TCP/IP interface and a call-in. I see the performance benefits of the call-in. I wonder how it deals with multiple incoming threads - this is simply outside my experience level at this point. If a TCP/IP interface were created, why not just have Smalltalk implement HTTP and integrate with the HTML via URLs?
- How do I scale this solution? Are non-blocking calls enough? (I'm working with a framework and have achieved non-blocking calls to Oracle for persistence.) One of the advantages of the scripting solutions is that they run on different OS threads/processes.
I believe that any good web application server will use HTML templates heavily. See photo.net/wtr for a very insightful analysis of web architectures. It is a book and an MIT PhD thesis, but well worth reading. There are lots of systems with this architecture, including ASP, JSP, PHP, and Zope.
HTML templates are easy to implement in Smalltalk. We have implemented them for WikiWorks by reading in the template, translating it into Smalltalk, and compiling it. Then each request is handled entirely by compiled Smalltalk code, not by an interpreter. We should be faster than interpreter-based approaches like ASP, though some of the approaches (JSP, for example) are also compiler based.
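As a rough illustration of the translate-and-compile idea described above, the following Python sketch turns a template into the source of a function once and then serves every request from the compiled function. It is not the WikiWorks implementation: the `<% ... %>` delimiter, the name compile_template and the single `request` argument are all assumptions made for the example. Because the embedded expressions are executed as code, templates must come from a trusted author.

```python
import re

TAG = re.compile(r"<%(.*?)%>")  # hypothetical delimiter, not WikiWorks syntax


def compile_template(text):
    """Translate a template into Python source once; return the compiled function."""
    lines = ["def render(request):", "    out = []"]
    pos = 0
    for match in TAG.finditer(text):
        # Literal HTML before the tag becomes a constant in the generated code.
        lines.append("    out.append(%r)" % text[pos:match.start()])
        # The embedded expression is pasted into the generated source as code.
        lines.append("    out.append(str(%s))" % match.group(1).strip())
        pos = match.end()
    lines.append("    out.append(%r)" % text[pos:])
    lines.append("    return ''.join(out)")
    namespace = {}
    exec(compile("\n".join(lines), "<template>", "exec"), namespace)
    return namespace["render"]
```

Usage: `render = compile_template(page_text)` is done once per template, after which `render(request)` runs no parsing at all per request.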
I'm not sure compiling is actually the right approach. I'm also not sure it's wrong, but here's my argument. The bottlenecks with web servers are not generally raw speed, but scalability and ease of use. Scalability is much more related to memory consumption than to raw speed. Thus, high-end transaction systems generally do not cache data, but re-read it from the database every time, because that way they don't keep around unnecessary data in memory. If we have large numbers of pages, compiling them into memory consumes more memory than re-reading from files, and operating systems are typically very good at maintaining sophisticated memory caches of recently accessed files anyway. OK, we're talking about an awful lot of files before this becomes an issue, but the other driver is ease of use. One of the nice things about web development is that I hit save in my text editor, then refresh in my web browser and I see the new page. I do not want to have to go to my web server and tell it that some file may have changed, nor do I want to wait 5 minutes. A file-based approach also doesn't have to be purely interpreted. We implemented something (and it wasn't very hard) where you parse the HTML the first time to find where the script bits are, and remember those offsets in the file and the code that goes there. Then, when serving the page you just copy up to an offset, execute the snippet, append the printString of the result, repeat. -- Alan Knight
HTML templates are necessary, but not sufficient. An HTML template usually creates some objects from existing classes and lets them do most of the work. You also need a way to load in new classes dynamically. This is easy in VisualWorks. WikiWorks has a "servlet" feature that loads parcels dynamically. Are ICs similar to parcels? Sort of, but it's probably trickier to load and unload them dynamically.
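The offset-based scheme described above can be sketched in Python roughly as follows. This is an illustration, not the code Alan's team wrote: the `<% ... %>` delimiter, the class name PageIndex and the use of modification time plus file size to notice a saved file are assumptions. The page text is re-read from the file on every request (leaving caching to the operating system), and only the snippet offsets and their compiled code are kept in memory.

```python
import os
import re

TAG = re.compile(r"<%(.*?)%>")  # hypothetical delimiter


class PageIndex:
    """Remembers where the script snippets sit in one template file."""

    def __init__(self, path):
        self.path = path
        self.signature = None   # (mtime, size) of the file when last scanned
        self.snippets = []      # (start offset, end offset, compiled expression)

    def render(self, context):
        stat = os.stat(self.path)
        with open(self.path, encoding="utf-8") as f:
            text = f.read()
        signature = (stat.st_mtime_ns, stat.st_size)
        if signature != self.signature:
            # File was saved since the last scan: find the snippets again.
            self.snippets = [
                (m.start(), m.end(), compile(m.group(1).strip(), self.path, "eval"))
                for m in TAG.finditer(text)
            ]
            self.signature = signature
        out, pos = [], 0
        for start, end, code in self.snippets:
            out.append(text[pos:start])                # copy up to the offset
            out.append(str(eval(code, {}, context)))   # run snippet, print result
            pos = end
        out.append(text[pos:])
        return "".join(out)
```

With this arrangement, saving the file in an editor and refreshing the browser shows the new page without telling the server anything.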
Integrating static pages is easy. Use two servers! Or use one, and connect Smalltalk to the server. But any high-volume site will need more than one server anyway, so put the static pages on one set and the dynamic pages on others. This runs into other issues, including not wanting to expose anything but the basic web server outside the firewall. Note that if your database changes slowly enough, you might be better off by making most of your pages static and having the Smalltalk code handle changes to the database by generating new HTML and writing it out to the files that are used by the faster web server. I've seen this used on real projects very successfully, e.g. the Bank of Canada web pages for dormant bank accounts, which generates enormous numbers of static pages which are then indexed by a search engine. Unfortunately for Smalltalk, by the end the page generation was so simple they did it directly from the mainframe code -- Alan Knight
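The generate-static-pages idea above might look like the following Python sketch: whenever the data changes, the application writes plain HTML files into a directory that the ordinary web server serves. The function name publish_static_pages, the table layout and the record structure are hypothetical, and the record keys are assumed to be safe file names.

```python
import html
import os


def publish_static_pages(records, out_dir):
    """Write one HTML file per record; return the list of paths written."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for key, fields in records.items():
        rows = "".join(
            "<tr><th>%s</th><td>%s</td></tr>"
            % (html.escape(str(name)), html.escape(str(value)))
            for name, value in fields.items()
        )
        page = "<html><body><table>%s</table></body></html>" % rows
        path = os.path.join(out_dir, "%s.html" % key)
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            f.write(page)
        # Swap the finished file into place so the web server
        # never serves a half-written page.
        os.replace(tmp, path)
        written.append(path)
    return written
```

Only the records that changed need to be passed in, so the cost of regeneration follows the rate of change of the database rather than the request rate.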
But I'd like to know how much faster Apache is at handling static pages than the VisualWorks webserver that WikiWorks uses. I wouldn't be surprised if it is faster, but I wouldn't be surprised if it isn't THAT much faster. Does anybody know a good benchmark? How do you measure web servers? Ralph Johnson
i doubt there is much of a speed difference, and if there was i'm sure you could bring WikiWorks closer to par with a little profiling; after all, it doesn't take much to open up a file and spew out the bytes. at least when i was deciding to integrate squeak with apache, i didn't see speed as a primary benefit. the thing i like about apache is that it has such a large user base; that translates directly into more nits being picked (like protocol compliance), more interesting features submitted by third parties, and quick turn around on bugs and security fixes. of course, if they would have just coded the thing in Smalltalk to begin with, bugs and security holes would have been less of an issue....:) David Farber
Um, I'm sensing a certain naivete here about the difficulties of building high-end systems. Sure it's not that difficult to open up a file and spew bytes. To do this at high performance we'll need (platform-dependent) memory-mapped IO, we'll need to spew bytes as fast as C, which is not something Smalltalk is noted for, we'll want to be able to exploit SMP boxes as well and as easily as Apache does, we'll need to worry about running multiple OS processes and coordinating them, etc. etc. For heavy loads, I would be very surprised if we can keep up with something like Apache. But discussing it like this is pointless. Let's measure it. Fortunately, the question of how to measure web server performance has recently been answered by the public domain release of tools which have made the news. Download these hacker tools I keep reading about, get control of a group of machines (with the owner's permission, this time) and see how much it takes to bring down a Wiki vs. a pure Apache site. Ok, there are probably other free tools that are a little friendlier and more geared toward measurement than mayhem. Another interesting reference (again from Greenspun) is ArsDigita Server Architecture -- Alan Knight
Fair enough. Things that can handle reasonable-sized web sites already exist, though it would be nice to be able to bring them together, and have a good, standard templating facility that can compete with PHP, JSP, etc. I think we can do much better than reasonable-sized, though. I would hate to have to decide not to use Smalltalk for a project because it would break down once my web site became very high-traffic and would need to be rewritten in something. There's no reason that has to happen, but we have to be aware of some of the issues to avoid them. Another reason for interfacing to standard tools is precisely the mindshare and to avoid being seen as a tool that has to take over the entire web-server so that it can serve content to 5% of the pages. -- Alan Knight
I don't see any reason that a Smalltalk web server can't have good performance. Medusa is a high performance webserver written in Python. It is used as the basis of Zope's ZServer.
I agree that scalability and performance are issues which should be considered. And I don't see why Smalltalk can't perform and scale reasonably. Zope was able to go from a single server model to one which can have multiple Zope servers which act as clients to the ZODB (Zope Object DataBase). Because they followed good OOD they were able to do this with a minimum of code. Now all they have to do to scale is add another machine, configure and go. They call this ZEO (Zope Enterprise Objects).
I think Smalltalk could be a wonderful platform for developing high performance, scalable websites with dynamic content.
A Smalltalk webserver may not perform like an Apache or AOLserver. I don't think it has to, provided that it performs well and has a solution for scaling. I don't know that there is necessarily any webserver which will be able to handle any website on a single machine. If so, the machine will be prohibitively expensive. Maybe not a concern for some, but for others, well... For most solutions a means of scaling is required so that 'when' that time comes, you can. If a solution scales elegantly, well and transparently then it won't be a big concern.
Is the PWS mailing list appropriate for Comanche also? The archive list I've seen only goes to March. Is there another mailing list for Comanche?
Is there or will there be a mailing list for this project?
This sounds like it would be interesting especially if it worked with Squeak and was open source.
I can't tell from this wiki what progress has been made towards the development of a web solution for Smalltalk or Squeak.
Thanks for any pointers or comments.
Instead of developing an entire Application Server at once one could split it up in two layers:
- Webserver layer
- Model server layer
These two layers would communicate in XML, or more precisely, through the SAX interfaces, used locally or mediated by XML and HTTP. Inspiration may be drawn from the Apache Cocoon 2 project (which may also be another level for Interfacing Apache to Smalltalk). A GUI may also be used on top of the SAX interface on the Business model. An architecture like this is mapped out in Webtoko architecture.
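A small Python sketch may make the "SAX interfaces, used locally or mediated by XML" idea concrete. The model object describes itself as a stream of SAX events; the web layer is an ordinary SAX content handler that turns the events into HTML. The same handler works whether the events arrive by direct call or after a trip through serialized XML. The Account class, the element names and the helper via_xml are invented for this example; the handler and generator classes are from Python's standard xml.sax package.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator, escape


class Account:  # hypothetical model-server object
    def __init__(self, owner, balance):
        self.owner, self.balance = owner, balance

    def emit(self, handler):
        """Describe this object as a stream of SAX events."""
        handler.startElement("account", {"owner": self.owner})
        handler.startElement("balance", {})
        handler.characters(str(self.balance))
        handler.endElement("balance")
        handler.endElement("account")


class HtmlRenderer(ContentHandler):
    """Web-server layer: turns the event stream into an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def startElement(self, name, attrs):
        if name == "account":
            self.parts.append("<h2>%s</h2>" % escape(attrs["owner"]))
        elif name == "balance":
            self.parts.append("<p>Balance: ")

    def characters(self, content):
        self.parts.append(escape(content))

    def endElement(self, name):
        if name == "balance":
            self.parts.append("</p>")

    def html(self):
        return "".join(self.parts)


def via_xml(model, handler):
    """Mediated variant: serialize the events to XML text, then parse it back."""
    buffer = io.StringIO()
    model.emit(XMLGenerator(buffer, encoding="utf-8"))
    wire = buffer.getvalue().encode("utf-8")  # what would travel over HTTP
    xml.sax.parseString(wire, handler)
```

Calling `account.emit(renderer)` is the local case; `via_xml(account, renderer)` stands in for the two layers running on different machines.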
The webserver layer would probably use XSLT to generate HTML if it needs to. The other way around is more complicated as that is where most of the user interface logic sits: the conversion of end user requests to queries and transactions for the model server, probably communicated through SOAP. An alternative may be Smalltalk Server Pages with scripts :-(
The model server would probably contain an Object/Relational Mapping framework like GLORP or business objects approached through some ODBMS-like query technique. Here another kind of mapping takes place between XML and objects. I agree with Roger Whitney that it would be interesting to be able to map between objects and both SOX schema and XML Schema, especially to process more complicated business documents. My project proposal Semantic Mapping may be an inspiration here: I think XML to object mapping is more of the same.
As you see, half of the Application server's components are already available! HOWEVER for Semantic Mapping one needs run time metadata to map to. In other words, a component model. VSE and VAST have limited component models for their visual composition editors. I once built a more powerful component model for Business Objects, but is there something like an ANSI Smalltalk component model - standardized and portable? Or maybe one could just use an existing XML/SOX/RDF Schema implementation to describe (constraints on) Smalltalk objects? Or can we simply reuse the (mapped) metadata proposed on Object/Relational Mapping? -- Henk Verhoeven
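To show what "run time metadata to map to" could mean in the simplest case, here is a Python sketch in which a small mapping table drives the conversion between an XML document and a business object in both directions. It is not the Semantic Mapping proposal itself: the Invoice class, the element names and the table format are assumptions for illustration, and a real component model would carry far more (constraints, relationships, nesting).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Invoice:  # hypothetical business object
    number: str
    amount: float
    paid: bool


# Run-time metadata: XML child element -> (attribute name, converter from text).
INVOICE_MAP = {
    "Number": ("number", str),
    "Amount": ("amount", float),
    "Paid": ("paid", lambda text: text == "true"),
}


def from_xml(xml_text, cls, mapping):
    """Build an instance of `cls` from XML, guided only by the mapping table."""
    root = ET.fromstring(xml_text)
    values = {}
    for tag, (attr, convert) in mapping.items():
        node = root.find(tag)
        if node is None or node.text is None:
            raise ValueError("missing element <%s>" % tag)
        values[attr] = convert(node.text.strip())
    return cls(**values)


def to_xml(obj, root_tag, mapping):
    """Serialize `obj` back to XML using the same mapping table."""
    root = ET.Element(root_tag)
    for tag, (attr, _) in mapping.items():
        value = getattr(obj, attr)
        text = str(value).lower() if isinstance(value, bool) else str(value)
        ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode")
```

The point of the sketch is that neither function knows anything about invoices; everything specific lives in the metadata, which is the role a portable component model would play.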