master.txt

1.  OpenACS Title Slide

2.  What is a Web Community?

  * Examples: slashdot.org, Yahoo! Groups, imdb.com, blogspot.com, and
    even amazon.com (which has community features)

  * Not web communities: most ecommerce sites, most sites advertising
    a company, service or band.

  * common features: bulletin boards (bboards), news, comments,
    user-submitted stories

  * advantages: building a web community creates interest and
    publicity indirectly.  The site is useful beyond advertising.
    Members share knowledge, which reduces the need for the
    organization to produce all the content.

  * disadvantages: requires programmers and maintainers.  Static sites
    can be run w/ almost no thought beyond some basic UI design and
    the use of Dreamweaver etc.

Needs of a web community:

   1. magnet content authored by experts 

   2. means of collaboration  (bboard, comments, etc)

   3. powerful facilities for browsing and searching both magnet
      content and contributed content  (site-wide search)

   4. means of delegation of moderation (filters to block posters,
      content rating)

   #5. means of identifying members who are imposing an undue burden on
   #  the community and ways of changing their behavior and/or
   #  excluding them from the community without them realizing it  (bozo-filter)

   6. means of software extension by community members themselves
      (open source)

3.  Who wrote OpenACS, who uses it, and why is it open source?

   * Started by arsDigita, later taken over by the OpenACS gang.

   Used by:

       * Development Gateway (World Bank), www.developmentgateway.org - ACS

       * Knowledge Management System for Siemens Corporation
         (intranet application) - OpenACS/ACS hybrid

       * Deutsche Bank intranet - ACS

       * site59.com: last-minute travel site (www.site59.com) - ACS

       * scorecard.org: environmental site which at one point served
         30 db-backed page hits a second on an old Sun pizza box
         (Sun UltraSparc II) - proto-ACS

       * photo.net: community site for camera enthusiasts; serves
         hundreds of thousands of hits a day (www.photo.net) - ACS

    * Why open source: software companies make most of their money
      from services, not licenses, and this is especially true in the
      web world.  Open-sourcing reduces development costs and brings
      free publicity, free bug fixes and free packages.
      
4.  History of OpenACS

    * Philip Greenspun, Ben Adida, and crew wrote a website for Hearst Publications
      in the mid-90s.

    * Used the Illustra database, then moved to Oracle as Oracle was a much better database.

    * Philip founded aD to build and market the ACS

    * while building the company, aD convinced AOL to open source AOLserver

    * PostgreSQL came out and was a full-featured open source database

    * aD gets VC money

    * Ben Adida and some others started to port the ACS to PostgreSQL
      to make it built entirely on an open source platform.
  
    * aD decides to totally rewrite the ACS

    * 4.0 is released in mid-2000

    * arsDigita decides to become more "market savvy" and move away from TCL
      to Java

    * VC-appointed CEO starts to run the company like a dot-com

    * Philip tries to take back control of the company

    * OpenACS crew ports ACS 4.0

    * VCs spend most of the remaining capital paying off Philip

    * aD goes under and is bought out by Red Hat

    * OpenACS is now one of two or three major open source community
      systems, used by many small OSS companies on lots of projects.

6.  OpenACS 3-tiered Architecture (Diagram)


Browser   <-->   WebServer             <-->   Database
(viewer)         (application logic)          (data model & storage)


**** Outline general use cases .. multiple users accessing website at same time.

6. What is a database?

* Method of storing, organizing and rapidly retrieving data

* Robust to multiple writes and reads at the same time

* Through the '70s, mostly hierarchical databases (file-system on steroids)

* Hierarchical databases were not robust to changing data models

* Hence the relational database was born

* Basically a bunch of spreadsheets (columns and rows) with a declarative
  language (SQL) used to retrieve the data

7.  Responsibilities of Postgres
+
8.  Postgres vs. File System -- ACID Fundamentals

    * ACID: atomicity, consistency, isolation, durability

    * efficient retrieval of data through indexes (searching a
      million-row file for one row is slow, and it compounds when the
      search is crossed w/ another million-row file)

    * event listeners (triggers)

    * good system for coordinating data retrieval (joins)

      store information about the user in one table
      store information about the user's purchases in another table
      easily find out who bought pants on Oct-21st

    * more overhead on writes and reads
      maintaining indices etc.

    * embedded procedural language for performing common tasks inside
      the database.
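The join example above (users in one table, purchases in another, "who bought pants on Oct-21st") can be sketched in a few lines.  This is only an illustration using Python's built-in sqlite3 module rather than Postgres; the table names, column names, sample year and the helper names make_demo_db and buyers_of are all made up for the example.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE purchases (
            purchase_id  INTEGER PRIMARY KEY,
            user_id      INTEGER NOT NULL REFERENCES users(user_id),
            item         TEXT NOT NULL,
            purchased_on TEXT NOT NULL
        );
        -- The index lets the database find matching purchases
        -- without scanning every row.
        CREATE INDEX purchases_item_day ON purchases (item, purchased_on);
        """
    )
    conn.executemany(
        "INSERT INTO users (user_id, name) VALUES (?, ?)",
        [(1, "Tristan"), (2, "Armen")],
    )
    conn.executemany(
        "INSERT INTO purchases (user_id, item, purchased_on) VALUES (?, ?, ?)",
        [
            (1, "pants", "2002-10-21"),
            (2, "shirt", "2002-10-21"),
            (2, "pants", "2002-10-22"),
        ],
    )
    conn.commit()
    return conn


def buyers_of(conn, item, day):
    """Return the names of users who bought `item` on `day` (ISO date)."""
    rows = conn.execute(
        """
        SELECT u.name
          FROM users u
          JOIN purchases p ON p.user_id = u.user_id
         WHERE p.item = ? AND p.purchased_on = ?
         ORDER BY u.name
        """,
        (item, day),
    ).fetchall()
    return [name for (name,) in rows]
```

The query only says what is wanted (declarative SQL); how to combine the two tables and whether to use the index is left to the database.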

9.  Webserver Layer

    * what is a webserver

    * HTTP .. simple open protocol for Client-Server

    * anatomy of a standard page

       * some static, some dynamic, some database dynamic content.

10.  AOLserver vs. Apache

    * why AOLserver was used instead of Apache

12.  TCL -> why TCL?

    * Toy language

    * weak on data structures (only the list and associative array)

    * not buzzword compliant

    * weaker on heavy infrastructure if not used carefully

    * slow 

    * turing complete

    * satisfies 90% of a website's needs (Vignette StoryServer uses it too, and
      they charge tens of thousands of dollars, or used to)

    * rapid development .. can develop sites in much less time than in Java

    * on the web everything is a string .. but your fundamental data isn't

13.  AOLserver Native Services
     1.  Database API & pooling

       * ns_db api
  
       * pooling vs. new connections
 
       * no database swamping

     2.  Filters

       * violates one URL - one file

       * can be used for authorization or redirection

       * invisible to developers, so you can stack 3 million of them,
         slowing down requests, and not realize it.

     3.  Templating
 
        * ADPs to mix TCL and HTML code

        * scares HTML-monkeys

     4.  Connection API

        * Unified way to get basic information about requests and
          the client.  Based only on the client .. not on information
          specific to the system.

          ns_conn url
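The "pooling vs. new connections" and "no database swamping" points under the Database API item can be illustrated with a small sketch.  This is not the ns_db API; it is a generic Python pool, and the names Pool and handle are made up for the example.  Connections are opened once at startup, each request borrows one and gives it back, and a request that finds the pool empty waits instead of opening yet another connection to the database.

```python
import queue
from contextlib import contextmanager


class Pool:
    """Fixed-size pool: connections are opened once and then reused."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument function that opens a connection.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def handle(self, timeout=None):
        # Blocks while every connection is in use, so the database never
        # sees more than `size` connections at once.  With a timeout,
        # queue.Empty is raised if nothing frees up in time.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the request failed.
            self._idle.put(conn)
```

A page would then do its queries inside "with pool.handle() as db: ...", paying no connection setup cost per request.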

14.  Why are they insufficient?

15.  3.x vs. 4.x

        * Flat structure (use examples from documents)

        * good for single look-and-feel websites, monolithic structure

        * everything installed in one batch .. 

        * services tended not to be autonomous .. 

        * pile of code .. not well designed

        vs.

        * strong on infrastructure

        * packages allow real separation of functionality, tendency to design
          more reusable components

        * didn't have to install everything

        * good for monolithic and multi-purpose deployment

Sections:


Database Services

1. Data Model  (Compared to Vignette)

   * Vignette had some basic utilities and a very basic data model which
     was insufficient for building a Vignette site.  You ended up having
     to write a lot of your own data model while building on it.

   * OACS has strong data model for site-wide services.  Data
     modelling is a major portion of site-design.  Data model tested
     in a wide variety of situations so it tends to be pretty robust.

   * Data model is easily extensible .. the integration w/ the database
     is tight so it is easy to optimize.  (See database independence)


2. Database API (modifications, example of advantage of TCL .. show Java code)

   * db_1row

3. Basic Object Interface

  * All things which require site-wide services are an extension of
    ACS_OBJECTS

  * 


4. Database Functions

5. "Database Independence"

6. XQL
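For the Database API item (db_1row), the convenience being argued for is roughly "run a query that must return exactly one row and get the columns back by name, in one call".  The sketch below is a Python analogue of that idea using sqlite3, not the OpenACS implementation; the function name one_row and the exactly-one-row contract as coded here are assumptions made for illustration.

```python
import sqlite3


def one_row(conn, sql, params=()):
    """Run `sql` and return its single row as a dict keyed by column name.

    Raises LookupError when the query returns no row or more than one.
    """
    cur = conn.execute(sql, params)
    rows = cur.fetchmany(2)  # two rows are enough to detect "too many"
    if len(rows) != 1:
        found = "none" if not rows else "more than one"
        raise LookupError(f"expected exactly one row, got {found}")
    names = [col[0] for col in cur.description]
    return dict(zip(names, rows[0]))
```

The caller writes one line per query and never touches cursors, row counts or cleanup, which is the kind of boilerplate the "show Java code" comparison is meant to expose.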

Website Structure

0. 1 URL = 1 File

1. Packages (directory structure slide)

2. Package Instances

3. Site Map & Site Nodes


Request Processor

0. Why it exists

1. Anatomy of a request

2. How it handles a request 

3. Templating  (SLIDE?)

3a.  ad_page_contract, adp's vs. html::template

  * looping
 
  * conditional logic (if-then)

  * includes

  * reverse-includes (master)

4. Subsites 


Permissions:

1. Problem defined

1aa. users, objects, privileges

1ab.  Users and Persons

3. users, parties, groups

4. contexts

5. API .. what it gives you

6.  Utter Failure

* doesn't scale

* doesn't meet needs

Security

1. Basic Problem / Security Scenarios

  1.  Packet Sniffer

  2.  Left computer on (browser history, showing on screen, etc.)

  3.  Hacker/Defecting DB Admin

2. HTTP vs. HTTPS

2a. ad_secure_conn_p

2b. HTTP authorization code is insecure

3. Passwords, emails, one-way hashes

4. Authorization/Authentication

5. How to steal an identity

6. Always check your passwords

7. Don't store data

8. 2 signs that a website should not be trusted ..
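The "one-way hashes" and "Don't store data" items can be made concrete with a short sketch: store a salted one-way hash instead of the password, so that a packet of stolen rows or a defecting DB admin does not yield the passwords themselves.  This uses Python's standard hashlib and hmac modules; the function names and the iteration count are illustrative choices, not anything OpenACS prescribes.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest); only these two values are ever stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password, salt, expected_digest):
    """Re-hash the attempt with the stored salt and compare digests."""
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected_digest)
```

Because the hash is one-way, the site can check a password but cannot mail it back to the user; a site that can is storing it in recoverable form.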


Self-Documenting Server:

1. ad_proc, ad_library, ad_whatever

ad_proc -flags proc_name { args } {

    javadoc style @info

} {
    code ....
}

(the -flags were pretty meaningless last time i checked; proc_name
stands for whatever procedure is being defined)

ad_library {

    javadoc style @info

}

The documentation is stored in an in-memory array, and you can read it
through the /doc interface.
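The idea behind the self-documenting server (the definition call also records the documentation in memory, and a page reads it back) can be sketched outside of TCL.  The Python below is only an analogue: the names API_DOCS, documented, greet and doc_page are made up, and the real ad_proc does considerably more.

```python
import inspect

# In-memory registry: procedure name -> signature and doc text.
API_DOCS = {}


def documented(func):
    """Register `func` and its docstring at definition time."""
    API_DOCS[func.__name__] = {
        "signature": str(inspect.signature(func)),
        "doc": inspect.getdoc(func) or "(undocumented)",
    }
    return func  # the function itself is left unchanged


@documented
def greet(name):
    """Return a greeting.

    @param name who to greet
    """
    return f"Welcome {name}"


def doc_page(proc_name):
    """Render what a documentation page for `proc_name` might show."""
    entry = API_DOCS[proc_name]
    return f"{proc_name}{entry['signature']}\n\n{entry['doc']}"
```

Since the documentation lives next to the code and is registered by the same call that defines the procedure, the docs cannot drift out of sync with what is actually loaded in the server.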


A Typical ACS Page:

1. Database hits

2. ad_page_contract

3. template


Cache (Poorly Done problem):

1. Memory Caching

  * It's fast; w/ AOLserver it is easy to share information between
    threads since they share a memory address space.

  * Causes memory usage to increase; if caches are commonly used and
    never purged, RAM may be used up and the system starts going to
    swap space, which slows down every action on the system.

  * In a multiple front end server environment there may be cache
    inconsistency.  There is no efficient mechanism to update the
    caches on each of the servers.  Someone may reload the same
    page 4 times and see 4 different results.

  * Cache does not persist between server reboots (depending on
    stability of system this may not be a major concern but
    wait until you are slashdotted).

2. Database Caching

  * Works between multiple front ends.

  * Consistent between reboots.

  * More expensive to write and read.

  * With a massive # of front ends with replicated databases you
    will have cache inconsistency again.

3. Squid Caching

  * Great for mostly static content

  * Squid can act as a proxy/load balancer: it keeps oft-requested
    pages which don't change cached in memory and doesn't even forward
    those requests to the webservers.

  * Tiny variations like, "Welcome Tristan" instead of "Welcome Armen"
    can stop the page from being cached.

4. Amazon/Google-Style redirect caching

  * Probably the best solution in massive deployments.

  * A user is redirected to the same server over and over again.

  * Google has special indexes based on search terms so you are always
    directed to a machine which is specially tuned for your search
    criterion.

  * All the advantages of memory caching and squid caching without the
    problems of cache inconsistency.

  * Resolve memory leaks by having the cache flush old unused data.

5. Since OpenACS was designed with deployment on one to a couple of
   front ends in mind, it focuses on memory caching.  util_memoize
   stores data in a set of key-value pairs with a timestamp.  The
   oldest data is flushed as memory usage grows above a certain
   amount.  Database caching is easy to implement.
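The behaviour described for memory caching (key-value pairs with a timestamp, oldest data flushed first) can be sketched as follows.  This is a Python illustration, not the util_memoize source: the class and method names are made up, and a cap on the number of entries stands in for "memory usage above a certain amount".

```python
import time
from collections import OrderedDict


class MemoCache:
    """Key -> (stored_at, value), kept in order from oldest to newest."""

    def __init__(self, max_entries=1000, clock=time.monotonic):
        self._entries = OrderedDict()
        self._max_entries = max_entries  # stand-in for a memory limit
        self._clock = clock              # injectable so tests can fake time

    def memoize(self, key, compute, max_age=None):
        """Return the cached value for `key`, calling `compute` on a miss."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None:
            stored_at, value = hit
            if max_age is None or now - stored_at <= max_age:
                return value
            # Too old: drop it so the fresh copy is re-inserted as newest.
            del self._entries[key]
        value = compute()
        self._entries[key] = (now, value)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # flush the oldest entry
        return value

    def flush(self, key):
        """Forget `key`, e.g. right after the underlying data changes."""
        self._entries.pop(key, None)
```

Each front end would hold its own MemoCache, which is exactly why the inconsistency problem listed under Memory Caching shows up once there are several front ends.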


    
5.  OpenACS vs. Zope vs. Roll Your Own

    * OACS - Tightly integrated w/ the database

    * Zope - Uses custom object database for many parts, can also run
             on top of a RDBMS.
    --

    * OACS - standard site-wide method of handling users, permissions,
             site-wide search, templating, packaging, site-maps

    * Zope - ditto .. may run into trouble when concepts need to exist
             in two places .. like users.  

    --

    * OACS - most work done in editor of choice on top of OS of choice

    * Zope - lots of work done in browser interface ..

    -- 
 
    * OACS - non-trivial install, highly customizable

    * Zope - easy to install, less obviously customizable

    --

    * OACS - depending on level of customization upgrading may be painful if it
             involves changes to the database
  
    * Zope - Probably easier to upgrade

    --

    * OACS - TCL, weak on data structures, simple to learn and
             implement in, lots of custom constructs inside of OACS
             designed to accelerate development.

    * Zope - Python, strong on data structures, excellent language ..
        
             DTML, semi-programming language w/ HTML-like syntax
        
             Python is famous for being a compact and simple language;
             in its documentation Zope proudly (and
             prob. incorrectly) indicates that it ignores the benefits
             of Python.
by admin last modified 2003-04-28 10:42
 

