
SiSU - SiSU information Structuring Universe - Structured information, Serialized Units,
Ralph Amissah

Structured information, Serialized Units

SiSU - from less markup than the most elementary equivalent html, you can have more

1. Description

1.1 Outline
1.2 Short summary of features
1.3 How it works
1.4 Simple markup
1.4.1 Sparse markup requirement, try to get the most out of markup
1.4.2 Single markup file provides multiple output formats
1.4.3 Syntax relatively easy to read and remember
1.4.4 Kept simple by having a limited publishing feature set, and features identified as most important, are available across several document types
1.5 Designed with usability in mind
1.6 Code separate from content
1.7 Object citation numbering, a text or object positioning / citation system - "paragraph" (or text object) numbering, that remains same and usable across all output formats by people and machine
1.8 Handling of Dublin Core meta-tags making use of the Resource Description Framework
1.9 Easy directory management
1.10 Document Version Control Information
1.11 Table of contents
1.12 Auto-numbering of headings
1.13 Numbering and cross-hyperlinking of endnotes
1.14 "Skinnable"
1.15 Multiple Outputs
1.15.1 html - several presentations: full length & segmented; css & table based
1.15.2 EPUB
1.15.3 XML
1.15.4 ODT:ODF, Open Document Format - ISO/IEC 26300:2006
1.15.5 PDF - portrait and landscape, (through the generation of LaTeX output which is then transformed to pdf)
1.15.6 Search - loading/populating of relational database while retaining document structure information, object citation numbering and other features (currently PostgreSQL and/or SQLite)
1.15.7 Search - database frontend sample, utilising database and SiSU features, including object citation numbering (backend currently PostgreSQL)
1.15.8 Other forms
1.16 Concordance / Word Map or rudimentary index
1.17 Managed (document) directory, database, or site structure
1.18 Batch processing
1.19 Integration to superior Gnu/Linux and Unix tools
1.19.1 Backup and version control
1.19.2 Editor support
1.20 Modular design, need something new add a module

2. Markup and Output Examples

2.1 Markup examples
2.2 A few book (and other) examples
2.2.1 "Viral Spiral", David Bollier
"The Wealth of Networks", Yochai Benkler
"Two Bits", Christopher Kelty
"Free Culture", Lawrence Lessig
"CONTENT", Cory Doctorow
"Democratizing Innovation", by Eric von Hippel
"Free as in Freedom: Richard Stallman's Crusade for Free Software", by Sam Williams
"Free For All: How Linux and the Free Software Movement Undercut the High Tech Titans", by Peter Wayner
"The Cathedral and the Bazaar", by Eric S. Raymond
"Down and out in the Magic Kingdom", Cory Doctorow
"Little Brother", Cory Doctorow
"For the Win", Cory Doctorow
"Accelerando", Charles Stross
"Tainaron", Leena Krohn
"Sphinx or Robot", Leena Krohn
"War and Peace", Leo Tolstoy, PG Etext 2600
"Don Quixote", Miguel de Cervantes [Saavedra], translated by John Ormsby, PG Etext 996
"Gulliver's Travels", Jonathan Swift, transcribed from the 1892 George Bell and Sons edition by David Price, PG Etext 829
"Alice's Adventures in Wonderland", Lewis Carroll, PG Etext 11
"Through The Looking-Glass", Lewis Carroll, PG Etext 12
"Alice's Adventures in Wonderland" and "Through The Looking-Glass", Lewis Carroll, PG Etexts 11 and 12
"Gnu Public License 2", (GPL 2) Free Software Foundation
"Gnu Public License v3 - Third discussion draft", (GPLv3) Free Software Foundation
"Debian Social Contract"
"Debian Constitution v1.3", (simple/default markup)
"Debian Constitution v1.3", (markup adjusted for output to more closely match the original)
"Debian Constitution v1.2", (simple/default markup)
"Debian Constitution v1.2", (markup adjusted for output to more closely match the original)
"A Uniform Sales Terminology", Vikki Rogers and Albert Kritzer
"The Autonomous Contract" 1997 - markup sample
"The Autonomous Contract Revisited" - markup sample
"United Nations Convention on Contracts for the International Sale of Goods"
PECL, the "Principles of European Contract Law"
2.3 SQL - PostgreSQL, SQLite
2.4 Lex Mercatoria as an example
2.5 For good measure the markup for a document with lots of (simple) tables
2.6 And a link to the output of a reported case

3. A Checklist of Output Features

4. Introduction to SiSU Markup

4.1 Summary
4.2 Markup Examples
4.2.1 Online
4.2.2 Installed

5. Markup of Headers

5.1 Sample Header
5.2 Available Headers

6. Markup of Substantive Text

6.1 Heading Levels
6.2 Font Attributes
6.3 Indentation and bullets
6.4 Footnotes / Endnotes
6.5 Links
6.5.1 Naked URLs within text, dealing with urls
6.5.2 Linking Text
6.5.3 Linking Images
6.6 Grouped Text
6.6.1 Tables
6.6.2 Poem
6.6.3 Group
6.6.4 Code
6.7 Book index

7. Composite documents markup

Markup Syntax History

8. Notes related to Files-types and Markup Syntax

9. Commands Summary

9.1 Description
9.2 Document Processing Command Flags

10. command line modifiers

11. database commands

12. Shortcuts, Shorthand for multiple flags

12.1 Command Line with Flags - Batch Processing

Technical Information

13. Technical notes

13.1 See abandoned U.S. Provisional Patent Application

14. Diagram / Chart

14.1 The Chart
14.2 I/O
14.3 The Program
14.4 Software utilised
14.4.1 SiSU
14.4.2 SiSU Modules

15. SiSU development environment and technologies of interest, including data formats

15.1 Development environment, Debian
15.2 Programming language, Ruby
15.3 SGML & XML Family
15.3.1 SGML
15.3.2 XML Family
15.4 TeX Family
15.5 Pdf
15.6 Relational Databases, SQL
15.7 Other Databases
15.8 Text Search
15.9 Character Encoding, Unicode
15.10 Information Visualization
15.11 Metadata - semantic
15.12 Syndication, Web feed formats
15.13 Other
15.14 Editors
15.15 Version Control
15.16 Licenses

A Summary of notable events

16. A history of SiSU and its outputs including search

A Chronological history of developments on SiSU

1993

1994

1995

1996

1997

1998

1999

2000

2001

2002

2003

January
February
March
April
June
July
August
September
November
December

2004

January
February
March
April
May
June
July
August
September
October
November
December

2005

January
February
March
April
May
June
July
August
September
October
November
December

2006

January
February
March
April
May
June
July
August
September
October
November
December

2007

January
February
March
April
May
June
July
August
September
November
December

2008

January
February
April
June
September
October
November
December

2009

January
December

2010

March

FAQ, Howto, Installation, etc.

HowTo

17. Getting Help

17.1 SiSU "man" pages
17.2 SiSU built-in help
17.3 Command Line with Flags - Batch Processing

18. Setup, initialisation

18.1 initialise output directory
18.1.1 Use of search functionality, an example using sqlite
18.2 misc
18.2.1 url for output files -u -U
18.2.2 toggle screen color
18.2.3 verbose mode
18.2.4 quiet mode
18.2.5 maintenance mode intermediate files kept -M
18.2.6 start the webrick server
18.3 remote placement of output

19. Configuration Files

20. Markup

20.1 Headers
20.2 Font Face
20.2.1 Bold
20.2.2 Italics
20.2.3 Underscore
20.2.4 Strikethrough
20.3 Endnotes
20.4 Links
20.5 Number Titles
20.6 Line operations
20.7 Tables
20.8 Grouped Text
20.9 Composite Document

21. Change Appearance

21.1 Skins
21.2 CSS

Extracts from the README

22. README

22.1 Online Information, places to look
22.2 Installation
22.2.1 Debian
22.2.2 RPM
22.2.3 Source package .tgz
22.2.4 to use setup.rb
22.2.5 to use install (prepared with "Rake")
22.2.6 to use install (prepared with "Rant")
22.3 Dependencies
22.4 Quick start
22.5 Configuration files
22.6 Use General Overview
22.7 Help
22.8 Directory Structure
22.9 Configuration File
22.10 Markup
22.11 Additional Things
22.12 License
22.13 SiSU Standard

Extracts from man 8 sisu

23. Post Installation Setup

23.1 Post Installation Setup - Quick start
23.2 Document markup directory
23.2.1 Configuration files
23.2.2 Debian INSTALLATION Note
23.2.3 Document Resource Configuration
23.2.4 Skins

24. FAQ - Frequently Asked/Answered Questions

24.1 Why are urls produced with the -v (and -u) flag that point to a web server on port 8081 ?
24.2 I cannot find my output, where is it?
24.3 I do not get any pdf output, why?
24.4 Where is the latex (or some other interim) output?
24.5 Why isn't SiSU markup XML
24.6 LaTeX claims to be a document preparation system for high-quality typesetting. Can the same be said about SiSU?
24.7 Can the SiSU markup be used to prepare for a LaTeX automatic building of an index to the work?
24.8 Can the conversion from SiSU to LaTeX be modified if we have special needs for the LaTeX, or do we need to modify the LaTeX manually?
24.9 How do I create a GIN or GiST index in PostgreSQL for use in SiSU?
24.10 Are there some examples of using Ferret Search with a SiSU repository?
24.11 Have you had any reports of building SiSU from tar on Mac OS 10.4?
24.12 Where is version 1?
24.13 What is the difference between version 1 and 2?

Installation

25. Installation

25.1 Debian
25.2 Other Unix / Linux
25.2.1 source tarball

26. SiSU Components, Dependencies and Notes

26.1 sisu
26.2 sisu-complete
26.3 sisu-examples
26.4 sisu-pdf
26.5 sisu-postgresql
26.6 sisu-remote
26.7 sisu-sqlite

27. Quickstart - Getting Started Howto

27.1 Installation
27.1.1 Debian Installation
27.1.2 RPM Installation
27.1.3 Installation from source
27.2 Testing SiSU, generating output
27.2.1 basic text, plaintext, html, XML, ODF, EPUB
27.2.2 LaTeX / pdf
27.2.3 relational database - postgresql, sqlite
27.3 Getting Help
27.3.1 The man pages
27.3.2 Built in help
27.3.3 The home page
27.4 Markup Samples

28. SiSU Components, Dependencies and Notes

29. Breakage and Fixes

31st October 2006 - SiSU < 0.48.3 break against Ruby > 1.8.5-3, break on cyclic include; Fixed SiSU: >=0.48.3 (see notes)
21st September 2005 - Avoid ruby-1.8.3 (2005-09-21) and (2005-10-12), Ruby Segfaults; Fixed: later versions of Ruby (see notes)

License, Standard

30. License

31. Things SiSU Standard

Download information

Download information

32. Download SiSU - Linux/Unix

SiSU Current Version - Linux/Unix
Source (tarball tar.gz)
Git (source control management)
Debian
RPM

Changelog - sisu

33. SiSU Version Manifest / changelog

Current version
3.0
Previous versions
2.7
2.6
2.5
2.4
2.3
2.2
2.1
2.0
1.0
0.71
0.70
0.69
0.68
0.67
0.66
0.65
0.64
0.63
0.62
0.61
0.60
0.59
0.58
0.57
0.56
0.55
0.54
0.53
0.52
0.51
0.50
0.49
0.48
0.47
0.46
0.45
0.44
0.43
0.42
0.41
0.40
0.39
0.38
0.37
0.36
0.35
0.34
0.33
0.32
0.31
0.30
0.29
0.28
0.27
0.26
0.25
0.24
0.23
0.22
0.21
0.20
0.18
0.16
0.14
0.12
0.10
0.8
0.6
0.4
0.2
0.1
Release

Changelog - sisu-markup-samples

34. Version Manifest / changelog - SiSU Markup Samples

Current version
2.0
1.1
1.0

Method for providing digital documents including a common citation structure

[SiSU Provisional Patent Application of 2004 based on much older idea and work on SiSU, Abandoned]

The 'Invention' described (and diagrams) by Ralph Amissah.
Provisional patent application text prepared by Stephan Filipek of Winston & Strawn LLP

35. 1. Background

36. 2. Definitions

37. 3. Brief Descriptions of the Drawings

38. 4. Detailed Description of the Preferred Embodiments

39. 5. Document Processing, examples of subsequent steps

40. 6. Advantages of the Invention

41. 7. THE CLAIMS

Post Filing Appendix

42. Post Filing Appendix: Reasons for Abandonment of Patent Process Claim

Endnotes

Endnotes

Metadata

SiSU Metadata, document information

Manifest

SiSU Manifest, alternative outputs etc.

A Chronological history of developments on SiSU

1999

February 1999 Decision made: Gnu/Linux identified as the most attractive way forward. Perl works as it should on the platform. I have had a good time with NT, but it is resource hungry. (More recently I hear MS has plans to do something to address its shortcomings in the Perl department.)  289


March 1999 Lex Mercatoria site down. Critical hard disk failure. Have been working on a new site - all texts being generated by Perl scripts, which greatly improve the ease of maintenance. A trip to Norway is called for. Question is whether to get the old site back up, or push on to have the new site ready as soon as possible.

8th March 1999 Ralph Amissah - made a Fellow of the Institute of International Commercial Law, School of Law, Pace University, White Plains, NY, USA

17th May 1999 New site is ready, planned hosting in Norway and the US as detailed in the credits at the bottom of the pages.

More efficient techniques used in creating the site.

May 27-29, 1999 Lex Mercatoria back on the air and grateful to the Law Faculty of the University of Oslo for hosting the site. Somewhat streamlined, and for the time being possibly slightly smaller than we were, but technically superior to anything that we have been (construction of the site is fully automated, with only one page being manually constructed) and with the potential to become better yet. At this time the home page is the only manually generated page on the site, which is once again hosted on a UNIX platform (Sun Solaris running Apache), which happens to be what the University of Oslo uses.

May Scripts (numbering system etc.) have been used at the request of Albert Kritzer and Richard Hainebach to produce a Kluwer text, Uniform Law for International Sales, Sales under the 1980 United Nations Convention, Third Edition by John O. Honnold, Schnader Professor of Commercial Law Emeritus, University of Pennsylvania, Secretary, UNCITRAL, and Chief, U.N. International Trade Law Branch, 1969 - 1974, Kluwer Law International. Kluwer also kindly made available International Project Finance by Hoffman for testing of scripts. At some point content from the Trade Law Project (prepared by our scripts) is noticed within the Kluwer Arbitration site; we did not have a problem with this, but the direction of content flow should remain clear.

2nd June 1999 LexMercatoria regenerated with first set of "bugs" cleared; most documents should now have titles, which are required for meaningful query results from the search engine. (Any fresh bugs will be corrected in the next update.)

14th July 1999 There has been quite an extensive update of the site though much remains to be done. For a trial period of three weeks we will try to wean you off our old home page and trust you will be able to find your way about our new one. If your browser supports redirection, you will be redirected to the auto-generated page one minute after the old home page has been fully loaded. Unless there is good reason to reconsider we are likely to phase out the old home page, in time.

Download times for the site would speed up considerably if we dropped the use of tables on long documents, and we are considering this. This is particularly noticeable if you (like myself at present) are not amongst the privileged with broadband Net access. There are bound to be a few bugs. Not all files have yet been transferred from the old site to the new, though the new site contains a more up to date set of documents. Our old file system was insensitive to case, the new file system is case sensitive, some links may not yet be fully compliant. Patience, these and any other issues will be addressed.

6th December 1999 Another new interface for the site is under test, the result of another generation of improvement in our site building tools (collectively fondly nicknamed SiSU). Information on the text presentations and navigation is available.  290  There is much greater consistency in presentation, and viewing should have been enhanced and (for the most part) made faster across most graphical browsers and platforms. What you will not see, as we unfortunately do not provide examples of it, is that it is particularly well suited to the electronic publication of books, and has been tested on several legal academic and practitioners' texts of over 500 pages. In parts of the site there are likely to be some "bugs"; these, however bad they look, should from a technical standpoint be minor to correct.

Status as of year end 1999 The document providing information on the text presentations and navigation contains a summary of the year from that perspective which is copied below:

The site has undergone a facelift for the Millennium, but in most respects our focus with regard to the presentation of documents has remained the same. We hope it results in an improved user experience.

In 1993 we boldly set out amongst other things:

"To explore, utilize and demonstrate the potential of the new IT mediums insofar as they pertain to our chosen subject area."

We have largely achieved this goal in demonstrating how various complicated legal (and other) documents of different content, structures and sizes can be presented on the Net using simple html.

If we have been limited in the possibilities that we have explored and utilized, our path has been selected by figuring out what could be achieved most effectively/successfully with limited resources. We have stuck to a few basic tools and rules of thumb, and have gained considerable experience in: getting the most out of the basic text markup language of the Web, html, without frills; efficient site management; the selection and effective use of basic tools (an editor, markup languages, scripting languages); and how to efficiently maintain cross platform (server and browser) compatibility in our product, through the selection and careful use of inter-operable and preferably open standards, and focus of effort on (a few of) what we determine to be key complementary technologies. Our approach has been to identify simple, effective and efficient tools and solutions and to get the most out of them. In effect we have been exploring what can be made of technologies that are available to anyone on the Net. We have also kept an eye on other IT technologies that we do not necessarily use but provide for your perusal and benefit through the maintenance of an information technology compendium.

In the construction of this site our primary focus has, since the outset (1993), been on presenting texts using html in a convenient manner. It has in part represented an experiment in how best this might be done for our purposes. The results remain as good as can be found anywhere for publications using html 4.0.

Our aim has been to be able to create, provide and efficiently maintain high quality, usable presentations of texts (legal, academic, practitioner's, and including conventions, rules, contracts) whilst avoiding unnecessary complexity. Indeed, so far it has been achieved using the most basic of markup languages on the Net, plain html, with the help of Perl scripts  291  for its transformation from ascii.

Our 1996 list of design criteria for text presentations has now been met and implemented consistently throughout the site [though a few bugs may still remain]. Whilst most individual requirements set were met as early as 1997, presentations have been continuously improved upon. The rationalisation of how best to achieve consistent presentation across various types of text, and its implementation, is a feature of 1999.  292  An idea of these criteria may be gleaned from the contents of this document.

The year's changes improve the site and provide greater utility from text presentations, including: greater consistency between different types of presentation; improved navigation of the site and individual texts; faster loading and better rendition of texts across different types of browser, the main ones we support being Opera, Internet Explorer, Netscape Navigator, (and we expect Konqueror).

The programs that generate the site have been tested on several books (academic and practitioner's texts) of over 500 pages, and the results are particularly well suited for their electronic presentation. The text navigation and presentation features (generated by the site generation program) come into their own on these longer texts, in which it is easy to appreciate the utility of the resulting document presentations.

So on the technical front we are now, in a sense, free to set new goals, and indeed may look in a number of additional directions. The site has concentrated on making the most of html presentations across most modern browsers, without making concession to having different presentations for different types of browser. In future we may also present texts in RTF and possibly pdf, but our primary additional focus will be on XML and we will look at xhtml. PHP, being open source and designed for cross-platform functionality, is of interest. We may if requested go back to having (in addition) html presentations without our paragraph numbering. In mentioning these possibilities we perhaps run a bit ahead of ourselves, as far as this text is concerned.

Introduce a navigation page describing how to use the auto-generated pages on Lex Mercatoria: http://www.jus.uio.no/lm/navigation/doc.html

"Always remembering that we remain a small unit and will continue to do what we can."




 289. This proved to be one of the best decisions I made in technology. That Gnu/Linux is more stable and less resource hungry was immediately evident, but that it gives you so much room to grow and take advantage of its rich offerings ... well, ... you learn with time, at least I did. Gnu/Linux offers better performance, reliability, scalability, security and total cost of ownership, and generally a more powerful and flexible environment in which to invest technology skills. My opinion of course.

 290. http://www.jus.uio.no/lm/navigation/toc.html

 291. We have a suite of in-house Perl scripts (collectively called SiSU) that are used to generate the site from ascii with the minimum initial markup required to enable the possibility of generation of the texts on this site in html. We will build upon these scripts and the ascii files in further developing the site and approaching other forms of text markup.

 292. This was in part spurred on by the disk failure of March 1999 and the need to further improve the efficiency of site maintenance; fortunately this need was complementary to that of achieving greater consistency in the presentation of texts.

