Friday, May 26, 2006

banyanTree project

The banyanTree Project

Introduction:
The banyanTree project is intended to be the central resource that brings volunteers (interested professionals, students, programmers, language translators, etc., or even end users) and FOSS projects together. The idea is very natural: a common meeting place for people new to FOSS, so that they can get to their project of interest easily. An interested person is assumed to have some skill and to be looking to help in a particular field, but doesn't know which project or org needs that skill at the moment. Some may know which org to join and have some of the needed skills, but may need some other related help. It is also intended to become the central "guide" for the newcomer. Think of banyanTree as your tour guide: getting you around the city and describing the needed details.

History:
Two years back, when I was trying to get involved in a project related to Flash, I had to do a lot of roaming around. I found GPLFlash and many other projects, but it took time. The main reason is that there isn't a single place where all interested organisations or projects come together and publish their details regarding both development and usage of the software. I would like to call such a place a "guide" to the FOSS world. I thought hard about implementing this, but for many reasons I could not. This year (2006), while I was applying for Google Summer of Code, I suddenly discovered a place to discuss my thoughts. Thus we now have banyanTree. The rest is what you see now.

What we are to do:
banyanTree is a combination of a website and other communication media like IRC. A team of developers will have to build the site and also maintain it. It is basically a community updates tracking system plus a search engine that is specific to searching the depths of FOSS.
The updates tracking system: project maintainers update their information regarding developers or volunteers required, project introduction and detail URLs, mentors available, sponsoring (if any), etc. The details will be kept in a central database, and also mirrored on the projects' sites if needed. Any visitor has access to these details at any moment. The maintenance of the details is the most important part: keeping things up to date, notifying interested parties or individuals, and keeping track of related happenings in the industry which may benefit the volunteer/visitor. The key point is the distributed architecture, which allows project maintainers and mentors to update information in the central database.
The search engine: this allows extensive search of the information in the databases. It could also grow to become a FOSS-specific search engine that covers any FOSS-related topic.
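To make the two parts a little more concrete, here is a minimal sketch of how a "current need" posted by a project maintainer could be stored and searched by skill. Nothing here is decided for banyanTree: the names ProjectNeed and NeedsRegistry, and the fields they carry, are assumptions made only for illustration, and a real system would sit on a database rather than a Python list.

```python
from dataclasses import dataclass


@dataclass
class ProjectNeed:
    """One 'help wanted' entry posted by a project maintainer (hypothetical shape)."""
    project: str
    skills: set          # e.g. {"PHP", "Atom"}
    description: str
    url: str = ""        # link to the project's own detail page, if any


class NeedsRegistry:
    """In-memory stand-in for the central database of current needs."""

    def __init__(self):
        self._needs = []

    def post(self, need):
        """Called when a maintainer or mentor adds or updates a need."""
        self._needs.append(need)

    def search(self, *skills):
        """Return the needs whose skill set covers every requested skill."""
        wanted = {s.lower() for s in skills}
        return [
            need for need in self._needs
            if wanted <= {s.lower() for s in need.skills}
        ]
```

A volunteer who knows PHP and Atom would then call something like registry.search("php", "atom") and get back only the projects asking for that combination right now.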

The name banyanTree:
The reason for the name comes from a mixed feeling and cannot be easily expressed in words, but I have tried to put it in plain English here:
The banyan tree is a large tree with aerial roots which hang from the branches, so a single tree can spread over a very large area. This type of tree is mostly found in the Indian subcontinent. They grow together into a very wide tree. The banyan tree creates a very different feeling from that of any other tree I know of. It expresses the feelings of "spreading out" and "spreading deep". This is exactly what the project is intended to do. Also, any tree in general provides shelter, and "Green is the color symbolizing earth, nature, and in a broader sense, life" (wikipedia.org).

Current status:
We are working on the initial roadmap and plans. We are also trying to build a team of initial programmers and maintainers. All interested human beings are welcome to join the IRC channel #banyantree on freenode.

Links:
banyanTree QA Rounds on IRC
Another IRC session on country specific issues

Saturday, May 13, 2006

Human Language Learning System

It's exam time right now... so can't think much :P But for a few days two different ideas have been making the rounds in my brain (ahh, brainless has a brain o.O). The first one involves a different approach to the UI of information-kiosk type sites. I will discuss this one later with some examples.

But the second one seems more interesting to me. It comes from my direct need or wish to learn many global (human) languages other than English. I have never come across a community-based language learning system, even though the community has been able to create and maintain such beautiful efforts as Wikipedia. I believe the reason is partly the absence of such a system. Here I will try to express my thoughts, which are just the surface of the system. I want more people to think about this so we can have a real, practical solution which can be easily implemented.

Learning human languages is, I hope, something most people like, and it is important too. To learn a language easily we need some reference between the target (the language we are going to learn) and some other language we already know. For example, if I want to learn Spanish, the easiest way would be for someone to start with a simple verbal explanation of or introduction to the language. Aiding this with visual examples, like the written words or pictures of the object being described, is very helpful. This is very easy to do in real life, or maybe over a special direct audio/visual communication system. What could be a good solution is the right mixture of existing open protocols which enable text, audio, image and video transmission on the internet, and an application to present it all in the required manner. If all this seems a bit confusing then read the following example:

I want to learn Spanish. Assume that I find a person, say Jack, who knows both English and Spanish and is willing to teach me. The general way is to introduce me to the alphabet and to smaller words (with their respective English meanings). Starting with the "A for Apple" thing is better when the picture of an apple is there (an apple may be known to most people in the world, but there may be other unknown things). The addition of voice in the background is needed.

Then we move to some grammar stuff, which is more about explanation using voice and text. Next Jack could start showing me smaller paragraphs of text and read them out so that I can practice reading in my mind. It would help to be able to underline the word (or block of words) he is currently reading in, say, a 10-line text. It will help me keep track of the word he is currently speaking (as in some karaoke stuff I have seen).

This is the basic process. Now the following just brings out the important points in this system:
  • Text in the target language, and also supporting text in the known language, is written by the author (a teacher for example). Text can be rendered to an image if there are issues with fonts. Jack simply types text into the application on a slide-by-slide basis (as in a presentation).
  • Thus a "Slide" becomes a basic block of the system. A slide can contain text, image, audio and video. The slide can be well described in terms of the items (text, image, etc.) it contains. Image, audio and video can be encoded using formats like png, vorbis and theora respectively, or other better open formats that I don't know of. The slide itself is basically xml tags which describe what items it contains.
  • A number of slides make up a "Presentation". The presentation is an xml file. Simple :)
  • The items needed within the slides are simply "linked" by URL to the actual (image, audio, etc.) files.
  • The xml does contain the text though which is supposed to be displayed.
I am not going into the details of tags for xml file etc. But I will give an example of what Jack and I should find once such a system exists.
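Just to give a feel for it, here is a rough sketch of what building such a presentation file could look like. The tag and attribute names (presentation, slide, text, lang, href) are pure assumptions on my part, not a proposed format; the only things taken from the points above are that a slide holds its text directly and only links to image, audio and video items by URL.

```python
import xml.etree.ElementTree as ET


def build_presentation(title, slides):
    """Build a presentation element from a list of slide dictionaries.

    Each slide dict carries 'target' and 'known' text, plus optional
    'image', 'audio' and 'video' URLs. All tag names are hypothetical.
    """
    root = ET.Element("presentation", {"title": title})
    for s in slides:
        slide = ET.SubElement(root, "slide")
        # The text itself lives inside the xml.
        ET.SubElement(slide, "text", {"lang": "target"}).text = s["target"]
        ET.SubElement(slide, "text", {"lang": "known"}).text = s["known"]
        # Media items are only linked, never embedded.
        for kind in ("image", "audio", "video"):
            if kind in s:
                ET.SubElement(slide, kind, {"href": s[kind]})
    return root


if __name__ == "__main__":
    demo = build_presentation(
        "Introduction to Spanish",
        [{"target": "A", "known": "A as in Apple",
          "image": "http://example.org/apple.png", "audio": "a.ogg"}],
    )
    print(ET.tostring(demo, encoding="unicode"))
```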

First lets see how Jack does his work:
  • An application exists which allows Jack to create new slides and add text to each slide.
  • On the first slide he types the Spanish equivalent of "A" and records his voice for the letter. He may also record a reference audio "Pronounced as in ..." or something else which helps in the learning process. In this slide the text is too simple and doesn't need an underline.
  • Images can be linked in each slide as they are linked in html. Of course he may create his own image or link an image from a public server.
  • Jack continues the process until he completes the full alphabet, which he thinks is enough for the presentation "Introduction to Spanish". Nice work Jack.
The presentation file along with the image or audio files can either be separate files available through HTTP or similar, or be packed in a single compressed file. The presentation xml file must have a certain name. The item files (image, audio, etc.), if present in the compressed file, are fetched from that file itself, with the option to check the online URL for updated content. It is not compulsory that all the item files be in the compressed file; those that are missing are simply fetched from their URLs.
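The lookup rule for item files can be sketched roughly like this. I am assuming a zip archive here only because it is easy to show; the function name load_item, the fetch callback and the prefer_online switch are all made up for the example.

```python
import io
import zipfile


def load_item(archive, name, url, fetch, prefer_online=False):
    """Return the bytes of one item file (image, audio, ...).

    archive       -- an open zipfile.ZipFile holding the presentation
    name          -- the item's file name inside the archive
    url           -- the item's online URL, as linked from the slide
    fetch         -- callable taking a URL and returning bytes
    prefer_online -- check the URL first for updated content
    """
    in_archive = name in archive.namelist()
    if prefer_online or not in_archive:
        try:
            return fetch(url)
        except OSError:
            # Network trouble: fall back to the packed copy if there is one.
            if not in_archive:
                raise
    return archive.read(name)


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.ogg", b"packed audio")
    packed = zipfile.ZipFile(buf)
    print(load_item(packed, "a.ogg", "http://example.org/a.ogg",
                    fetch=lambda u: b"downloaded"))
```

In a real player the fetch callback could be something like a thin wrapper around urllib.request.urlopen; it is passed in here so the rule can be tried without a network.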

Now one fine morning I want to learn Spanish (at least try ...). I simply search the well-known spanish-pedia or something and get hold of Jack's "Introduction to Spanish". I download the full compressed file or simply click on the presentation xml. A browser plugin downloads the required items and I start learning... hurray... I know the Spanish alphabet.

Jack starts working on the second part of his series. He types a few short paragraphs which make good use of the alphabet. Here he adds underlines to the text. He could do that in the following way:
  • Type out the full paragraph of a slide.
  • After typing is done select a word or few words (maybe a sentence) and do a "ctrl+U" or something similar. Also with the selected text: speak and record the words/sentence as they should be pronounced.
  • The application keeps track of the time sequence for each underlined segment and its audio. This can simply be expressed in the xml itself as timestamps.
In a similar fashion Jack completes the presentation and as earlier I use it to learn more Spanish.
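The karaoke-style underline boils down to a small lookup: given the playback time, which segment should be underlined right now? Here is a sketch, assuming each segment is stored as (start seconds, end seconds, text) and that segments are sorted by start time and do not overlap. That layout is my assumption; the xml could carry the timestamps in any form.

```python
from bisect import bisect_right


def active_segment(segments, t):
    """Return the text to underline at playback time t, or None.

    segments -- list of (start, end, text) tuples, sorted by start,
                non-overlapping; end is exclusive.
    """
    starts = [seg[0] for seg in segments]
    # Index of the last segment that starts at or before t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < segments[i][1]:
        return segments[i][2]
    return None  # before the first segment, or in a pause


if __name__ == "__main__":
    demo = [(0.0, 1.2, "Hola,"), (1.2, 2.5, "me llamo"), (3.0, 4.0, "Jack.")]
    for moment in (0.5, 1.2, 2.7, 3.5):
        print(moment, active_segment(demo, moment))
```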

That's all for now... keep thinking. Discuss and comment with your thoughts. Cheers!

banyanTree QA Rounds

Discussion on #banyanTree (freenode):

[14:31] ai2097: Hate to be Mr. Negative... but shouldn't the details about what is being provided be pinned down first? Is this intended to provide software/tools, a community, the latter built atop the former (which is the impression I'm getting), or something else?
[14:32] KillerX: A community based on a portal
[14:32] SumitDatta: ai2097 that is exactly what we are doing now...
[14:32] KillerX: We will not host any projects
[14:32] KillerX: Or provide any infrastructure
[14:32] SumitDatta: discussing with the people and pinning down details
[14:32] SumitDatta: and we are not providing hosting
[14:32] KillerX: Merely the middlemen in connecting developers to those who need them
[14:33] ai2097: KillerX: Then you're providing infrastructure.
[14:33] KillerX: Of course, if Google funds us, we can start project hosting too ;)
[14:33] SumitDatta: just think the HR department ...
[14:33] ai2097: The HR department works for a company.
[14:33] KillerX: Yeah lets not compare it to HR
[14:33] ai2097: This "HR department" isn't tied to a single project.
[14:33] KillerX: :)
[14:33] KillerX: Wrong comparision
[14:34] SumitDatta: ok HR isnt the appropriate word, but i found it the closest
[14:34] SumitDatta: the word "guide" is best suited i think
[14:34] KillerX: ai2097: A little info on yourself would be helpful?
[14:34] SumitDatta: guide to those who are new to the community as a whole
[14:36] ai2097: KillerX: I'm nobody in particular ;). Bug fixes here and there, personal projects every now and again. I poked around the GPLFlash project for a while; http://gplflash2.blogspot.com
[14:38] SumitDatta: ai2097 think of this is the following way : Google SoC lists projects from many orgs ... it helps students to look around and join some project. these are projects not just related to sf or freshmeat.
[14:38] ai2097: Freshmeat, AFAIK, doesn't host anything.
[14:39] SumitDatta: i meant projects listed there... not really hosted
[14:39] KillerX: You are right. Freshmeat is just an index of projects
[14:39] ai2097: So, why can't we just convince freshmeat and/or sourceforge to add "help wanted" sections to each project?
[14:40] SumitDatta: what i mean is that we provide the listing of what is important right now : like FreeBSD may need someone on Fonts for example
[14:40] SumitDatta: sf has help wanted, not well coordianted
[14:40] SumitDatta: there are many projects that are not reached in those sections
[14:40] ai2097: But it's up to FreeBSD to do two things: 1) know they need a fonts guy, and 2) put out a request for it.
[14:41] ai2097: You're assuming that projects can figure out (1) in the first place, and have a problem with (2).
[14:41] KillerX: ai2097: The primary goal of BanyanTree is to help new people who know development but are new to the open source scenario
[14:41] ai2097: From what I can tell, if a popular project has a problem or a need, word -tends- to get around.
[14:41] SumitDatta: what about the guys who knows fonts for linux, could do it for freeBSD in summer, yet didnt get the freeBSD page in the 2 months time
[14:41] KillerX: Just think of it as an all year Google Sumer of Code, without the money
[14:42] SumitDatta: yes
[14:42] ai2097: That assumes too much.
[14:42] ai2097: I'm doing a research paper on a tangentially related issue, and I've come across a study conducted in 2003 about OSS projects and quality.
[14:43] SumitDatta: somethings must be assumed : and we are just assuming that project maintainers will be happy to join us
[14:43] ai2097: Nonsense! Use research, and figure out who your target audience is -- then cater to the needs of that audience.
[14:44] KillerX: ai2097: Completely agree
[14:44] KillerX: That is what we are doing :)
[14:45] KillerX: ai2097: We do not intend to assume (1) or even (2)
[14:45] SumitDatta: KillerX not really, target audience and interested students are two diff worlds
[14:45] * ai2097 digs up the study
[14:46] SumitDatta: ai2097, once we pull in the roadmap and the initial plans things will be easier to understand
[14:46] ai2097: Zhao, Elbaum; 2002. Quality assurance under the Open Source development model.
[14:47] SumitDatta: link ?
[14:47] KillerX: ai2097: care to send the paper to me: anant@kix.in
[14:47] KillerX: Sumit, that is a research paper, goddamnit
[14:47] SumitDatta: the pdfs of many research papers are downloadable :)
[14:48] SumitDatta: so i just asked...
[14:48] ai2097: "Documentation did not play such a dominant role.
[14:48] ai2097: Over 84% of the respondents prepare a ‘‘TODO’’ list
[14:48] ai2097: (including list of pending features and open bugs). 62%
[14:48] ai2097: build installation and building guidelines, 32% projects
[14:48] ai2097: have design documents, and 20% have documents to
[14:48] ai2097: plan releases (including dates and content).
[14:48] ai2097: "
[14:48] ai2097: Ack >_o
[14:48] ai2097: That is not what I wanted to do. Sorry for the flood >_<
[14:48] SumitDatta: hehe
[14:50] KillerX: Hehe
[14:50] SumitDatta: ai2097 do you think SoC is of no use then ?
[14:50] ai2097: http://scholar.google.com/url?sa=U&q=http://pegasus.rutgers.edu/~luyin/luyin.pdf
[14:50] KillerX: Yay
[14:51] ai2097: No, I don't think it's of no use, but it will only reach projects that either already know where they're going, or that don't exist (i.e., there's a technology hole), or that google itself (or someone elese) has architected extensions that it wants to see done.
[14:52] SumitDatta: orgs like Apache receive 25 or more students for 3 months projects : that lot of manpower : do useful projects. if such a permanent environment exist more students would happily spend summer and winter to code for some org even if there isnt money involved
[14:52] ai2097: Why does Apache need you to do that?
[14:53] SumitDatta: think not in terms of individual orgs... think interms of the community : each playing its role and we volunteers coordinating the site
[14:55] SumitDatta: think in this example point of view : you subscribe to a RSS feed which informs you of all C/C++ based projects involving Flash File Format....
[14:55] ai2097: Google is in a unique position to make itself known. I can appreciate the value of centralization, but how do you expect to become the next "SoC"-type setup when Google already exists, and (if I understand correctly) is what you're trying to emulate?
[14:55] ai2097: So, back to "why not integrate it with freshmeat"?
[14:56] SumitDatta: management problems of a new system with an existing one
[14:56] SumitDatta: thus point of starting from fresh
[14:56] ai2097: I have another study centering around the motivation of OSS contributors, too. Let me see if I can dig up a link.
[14:57] ai2097: Yes, but FM is already well-known, and already has a proven infrastructure in place. If you can extend that, instead of creating your own system, it will be more beneficial to everyone.
[14:58] SumitDatta: tell me something : when you visit FM homepage what do you see ?
[14:59] SumitDatta: if i am right i see lists of projects as in how and what they do.
[14:59] SumitDatta: we are concentrating on "how you can plug in to the project of your choice easily"
[15:00] ai2097: Well, how do you associate with a project?
[15:00] ai2097: Lemme dig up that link...
[15:01] SumitDatta: each projects has different *current* needs : from documentation to core programming. we speak to the people who are interested in devoting time to volunteer in a project : NOT users
[15:02] ai2097: http://scholar.google.com/url?sa=U&q=http://opensource.mit.edu/papers/hemetsberger1.pdf
[15:02] ai2097: Ah, but therein lies the rub: what motivates people to participate in an OSS project in the first place?
[15:03] SumitDatta: what motivated you to jump into OSS... we all know its personal choice... i am not forcing anyone
[15:03] ai2097: You're assuming that people want to come in out of the blue and contribute their time to any project that needs help.
[15:04] SumitDatta: i am just guiding those who want to come, yet are not aware of many current details
[15:06] ai2097: In other words, the mindset is "Gee. I really want to contribute to OSS, and I have . But I don't know who needs me!"
[15:07] ai2097: Actually, I should phrase that in the form of a question; is that statement correct?
[15:08] SumitDatta: its not that way.... its rather "Who needs my skillset *now* " .... take my example : i am doing gdata for drupal, because i am php coder and am interested in rss/atom stuff. but if i landed with joomla then maybe i wouldnt be able to know that drupal is asking it
[15:09] SumitDatta: here time constraints is important : a student mostly wants to learn and work on topics he knows best
[15:09] ai2097: So, shouldn't it be tied to the bug tracker, then?
[15:09] KillerX: ai2097: It is a widely known fact that those who are willing to code, find it *very* difficult finding people who need help
[15:09] ai2097: KillerX: Sources?
[15:09] SumitDatta: ai2097 how can a student think of tracking all 400 + CMS in PHP ??????
[15:09] SumitDatta: that is insane
[15:10] KillerX: ai2097: Experience. Personal and others. Feedback. Grapevine. Not a research paper though
[15:10] ai2097: Really? I can't seem to throw a stick without hitting projects that need some form of help or another.
[15:10] SumitDatta: that is because you want to help in anything
[15:11] SumitDatta: most students as i said concentrate in topics they know of
[15:11] ai2097: But then, I'm being argumentative for a purpose, too ;).
[15:11] SumitDatta: afterall they are not supposed to know everything
[15:11] ai2097: SumitDatta: About your previous comment, regarding "tracking all 400 + CMS in PHP" -- I don't understand what you mean.
[15:12] SumitDatta: "ai2097: So, shouldn't it be tied to the bug tracker, then?"
[15:13] SumitDatta: which project's bugs should he track to find the PHP + Atom combination ?
[15:13] ai2097: Oh, that's different.
[15:13] SumitDatta: there are so many CMS in PHP out there..
[15:13] SumitDatta: that is exactly our point
[15:13] SumitDatta: it is difficult to get to right projects FAST
[15:13] ai2097: What I mean is, since it's "what skills are needed now," the best way to integrate the -demand- side would be to put it in the bug tracker.
[15:15] ai2097: But then, that would require any projects that want to participate have a bugtrack system that allows the devs to specify what kind of skills are required to fix the bug.
[15:15] SumitDatta: ofcourse ...
[15:15] SumitDatta: this is tied to the projects after all
[15:15] SumitDatta: but the site volunteers make the task easier
[15:15] SumitDatta: and thus the need for such a unique system from scratch
[15:16] KillerX: ai2097: When I finish the roadmap, I will send you a copy. It will address all your concerns
[15:16] KillerX: :)
[15:16] SumitDatta: and thus no relation to FM
[15:16] SumitDatta: but frankly ai2097 you have probably all the questions that other mentors could ask
[15:16] SumitDatta: so i am going to Ctrl+C anc Ctrl+V all the above lines :)
[15:17] SumitDatta: you mind that ?
[15:17] ai2097: Go for it :)
[15:17] KillerX: Sumit, did you copy yesterdays conversation with Leslie?
[15:17] SumitDatta: i will put this in the blog... we both have tried to answer your questions
[15:17] KillerX: We should really log the channel, esp in the initial stages
[15:17] SumitDatta: KillerX, i forgot that i didnt have a log of that
[15:17] SumitDatta: do you have ?
[15:17] KillerX: :(
[15:18] KillerX: It's ok
[15:18] KillerX: We remember ;)
[15:18] SumitDatta: Leslie could have...
[15:18] KillerX: I'll ask her
[15:18] SumitDatta: ok
[15:19] SumitDatta: KillerX, i think we both did a great job in actually clearing out initial doubts since mentors will surely have questions
[15:21] KillerX: Sumit, I updated the banyanTree repos to use SVN instead of CVS
[15:21] SumitDatta: nice
[15:21] SumitDatta: you rock
[15:22] ai2097: I'm going over to a different screen; just say my name if you need me back.
[15:22] SumitDatta: ai2097, the way we see it, this project could help thousands of students around the world to get into FOSS
[15:22] ai2097: That was fast :p.


Discussion on #gnash (freenode):

[16:58] brainlessV2: tgc hi
[16:58] brainlessV2: i found out Gnash was there for Google SoC
[16:59] tgc: hi brainlessV2
[17:00] tgc: i wasent sure we even applied... but good!
[17:00] brainlessV2: also i have something that should interest you :
http://sumit.pixlie.com/2006/05/banyantree-project.html
[17:00] brainlessV2: ai2097 knows of this and has been discussing a lot with us
[17:07] tgc: brainlessV2: interesting, but hasen't this been done before?
[17:07] tgc: isn't sourceforge kindof the the same, except they also
hosts the projects?
[17:07] brainlessV2: not quite at this scale or that this involves all
the orgs mentors for ground up
[17:08] brainlessV2: banyanTree is intended to be very tightly knit from inside
[17:08] brainlessV2: also sf doesnt deal with projects outside sf
[17:09] tgc: is it intended to be SoC projects, or all FOSS projects?
[17:09] brainlessV2: and bt is primarily for the fresh minds that are
coming in or interested to come in to OSS
[17:10] brainlessV2: all FOSS ... but we think we will start with SoC
group since the mentors were already chosen and they wont mind helping
us a bit, i guess
[17:10] brainlessV2: we will have long discussions with all
participating orgs in one room... before that we are doing the initial
setting up of wiki etc so we can express the initial ideas
[17:11] brainlessV2: right now we are planning the roadmap and plans etc...
[17:12] tgc: you could just use a wiki to hold all the information in
BT, couldn't you?
[17:12] brainlessV2: no, because information gathering and
presentation needs huge systems
[17:12] tgc: isn't wikipedia huge?
[17:12] kjetilho: consider SF, or Freshmeat. there are tens of
thousands of projects. it simply won't come together as a whole
naturally
[17:13] brainlessV2: since at the end we want to be able to help many
thousands of students and similar number of projects
[17:13] kjetilho: tgc: Wikipedia isn't made for interactive discussion
[17:14] tgc: kjetilho: true
[17:14] brainlessV2: we dont want all of them to come as a whole : if
the initial orgs come : others will feel the benefit and join
[17:15] brainlessV2: it is difficult to put something of this scale :
but we must try
[17:15] tgc: i also see a big problem with dead projects, SF got loads of them
[17:16] brainlessV2: we are thinking of ways for that
[17:16] tgc: good :)
[17:16] brainlessV2: what we are trying to do i guide those students
at the school and college levels
[17:17] brainlessV2: they have skill yet are not aware of the details inside OSS
[17:17] brainlessV2: people do have trouble finding a particular
project of interest out of so many...
[17:18] brainlessV2: we want to fill that gap : so we need to have bt
as a much more user-friendly and shinny in a way maybe
[17:18] brainlessV2: in the end we will slowly be able to build teams
of developers around the world whom we have been able to show the way
: that is the goal
[17:21] brainlessV2: tgc, kjetilho you mind if i link to this sessions
log : So that people with similar query will get an idea ?
[17:22] tgc: have you thought about teaming up with some of the "big
players"? like OSTG?
[17:22] brainlessV2: yes we have thought of that...
[17:22] tgc: fine with me
[17:22] brainlessV2: but before that happens we need to put up our plans etc.
[17:23] brainlessV2: right now i am trying to install the darn wiki on sf.net ;)
[17:23] tgc: brainlessV2: good luck! wiki on SF is painfully!
[17:24] brainlessV2: yes that is what i can see
[17:24] tgc: which wiki will you use?
[17:24] brainlessV2: dokuwiki ... it is the simplest
[17:24] brainlessV2: but we have tried media, doku, and tiki
[17:24] brainlessV2: so i am retrying doku
[17:25] tgc: you could try moinmoin
[17:25] tgc: worked for gplflash
[17:25] brainlessV2: yes but i am newbie in python
[17:26] brainlessV2: though we plan to move to python
[17:26] brainlessV2: since it will give all flexibility for building
such a nice bt application

Friday, May 05, 2006

GData module for Drupal

Introduction :

GData is a (new) protocol from Google which is based on RSS and Atom and combines both of them. In fact, underlying GData are the RSS and Atom protocols themselves. GData makes available :
request (syndication), query (for search), insert, update and delete.

All these together make *remote* usage of a CMS much more a reality and since Google is behind this, there is a good chance of this becoming the de-facto standard in future.

A little deeper :

GData allows content to be syndicated as well as inserted, updated and deleted. Most importantly, there is a separate specification for queries. Nice! GData uses XML as described in the existing Atom and RSS specs for all of these. Different features, btw, are supposed to use different HTTP request methods, as outlined below:

Feature            HTTP method
Request / Query    GET
Insert             POST
Update             PUT
Delete             DELETE

Note: There are alternatives to HTTP PUT and HTTP DELETE : clients can use headers 'X-HTTP-Method-Override: PUT' and 'X-HTTP-Method-Override: DELETE' for PUT and DELETE respectively.
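On the server side, the method table and the override header can be handled in one small step. The sketch below is in Python only to show the logic (the Drupal module itself would be PHP), the function names are invented, and it assumes the override header is honoured only on POST requests, which is how I understand the GData docs.

```python
# Mapping from HTTP method to GData feature, as in the table above.
OPERATIONS = {
    "GET": "query",
    "POST": "insert",
    "PUT": "update",
    "DELETE": "delete",
}


def effective_method(method, headers):
    """Resolve the real method, honouring X-HTTP-Method-Override on POST."""
    method = method.upper()
    # Header names are case-insensitive, so normalise them first.
    lowered = {k.lower(): v for k, v in headers.items()}
    override = lowered.get("x-http-method-override", "").strip().upper()
    if method == "POST" and override in ("PUT", "DELETE"):
        return override
    return method


def operation_for(method, headers):
    """Return 'query', 'insert', 'update' or 'delete' (None if unsupported)."""
    return OPERATIONS.get(effective_method(method, headers))
```

So a client behind a proxy that blocks PUT sends a POST with the header 'X-HTTP-Method-Override: PUT', and operation_for() treats it as an update.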

There is also the authentication part of GData which, as Moshe pointed out on drupal.org, is Google specific; moreover, work on it is still in progress. So for now I have not concentrated much on it. I hope that area will get better soon.

Inside Drupal :

As far as I have studied, some problems which exist in relation to GData are:
  • Handling HTTP PUT and DELETE. PUT and DELETE do not seem to work well on all servers and clients across platforms. The headers ( X-HTTP-Method-Override: PUT and X-HTTP-Method-Override: DELETE ) of course come to the rescue, and to me the headers seem to be the best solution.
  • Queries have to use the URL format: site.com/myFeed?q=query-string. Here, as Moshe again pointed out on drupal.org, we cannot use the q part. In discussion with praseodym on #drupal-soc, the best solution seems to be to create a separate file, named say gfeed.php, which handles user requests and passes them on to Drupal after making modifications as necessary. So for all GData-related stuff, clients will use the URL site.com/gfeed.php?q=query. gfeed.php is also used for non-query purposes like normal requests, inserts, etc. Other solutions, like using .htaccess, also exist.
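The "modifications as necessary" could be as small as renaming one parameter. Here is a sketch of the idea, again in Python just to show the logic; the internal path "gdata", the parameter name "gdata_q" and the function name are placeholders I made up, not anything agreed on with the Drupal folks.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def rewrite_for_drupal(url, drupal_path="gdata"):
    """Turn a client-facing gfeed.php URL into an internal Drupal request.

    Drupal reads its own path from 'q', so the client's search query is
    moved to a differently named parameter before Drupal is bootstrapped.
    """
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    rewritten = [("q", drupal_path)]           # what Drupal expects in q
    for key, value in params:
        if key == "q":
            key = "gdata_q"                    # the client's query string
        rewritten.append((key, value))
    return "index.php?" + urlencode(rewritten)
```

For example, a request for site.com/gfeed.php?q=flash+player would be handed to Drupal as index.php?q=gdata&gdata_q=flash+player.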


What is needed in GData module and how it is done:

The GData module needs to define an API through which any module can register itself to expose data.
Since GData allows for Insert or Updates, modules need to specify the access permissions too.
There will be a file (say gfeed.php) which will take in the actual user requests, since a query needs the "q=something" format in the URL and "q" has a special meaning inside Drupal. Thus the normal bootstrap will not work, which brings in the need for gfeed.php. Users/clients access Drupal GData in the URL format: site.com/gfeed. When they need to query they use site.com/gfeed?q=something. gfeed.php then passes the request on to Drupal after making the required changes.
Inside the actual module, the response is built and sent back to the user as XML.
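The registration part of the list above might look something like the following. This is only a sketch of the idea in Python; in Drupal it would presumably be done through a hook, and every name here (GDataRegistry, register, allows, the permission strings) is hypothetical.

```python
READ_ONLY = frozenset({"query"})
READ_WRITE = frozenset({"query", "insert", "update", "delete"})


class GDataRegistry:
    """Keeps track of which modules expose data and with what access."""

    def __init__(self):
        self._modules = {}

    def register(self, module, operations=READ_ONLY, write_permission=None):
        """A module announces what it exposes and which permission guards writes."""
        self._modules[module] = (frozenset(operations), write_permission)

    def allows(self, module, operation, user_permissions=()):
        """Check whether a request for 'operation' on 'module' may proceed."""
        if module not in self._modules:
            return False
        operations, write_permission = self._modules[module]
        if operation not in operations:
            return False
        if operation == "query":
            return True  # read access is open in this sketch
        # Writes always need a named permission that the user holds.
        return write_permission is not None and write_permission in user_permissions
```

A module would call register("blog", READ_WRITE, write_permission="edit own blog"), and the GData module would call allows() before touching any data.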

Links :

GData overview
GData protocol
Drupal page on GData module
Roadmap for GData module
Me in a Drupal project

Related Links Elsewhere on the GData buzz :
GData - Google's new syndication protocol : From ZDNet.com
Google's GData, MySQL, and the Future of on-line Databases : Jeremy Zawodny's blog
Why Google is extending RSS : From ZDNet.com
Google Data APIs Protocol : Joe Gregorio
Google and RSS: GData : By Vincent
Google syndication : By Jeff Jarvis
GData - Google BCM protocol : By Deeje Cooley
GData: The end of Google's walled garden : By Maurice Codik
GData is a new protocol based on Atom 1.0 and RSS 2.0 : By Karl Martino
GData is about more than Google Calendar integration : By Mark McLaren
GData: Google's Extensible API : By Reto Meier
Google introduces GData: Google Calendar API : By Amit Agarwal

Also ASF has an application titled "Implement a Google Data API (GData) server using Lucene" for SoC 2006.

Roadmap for GData module work

How to successfully complete this project ?
Well the answer follows as a roadmap:
  1. The Atom module presents the features of the Atom syndication protocol and it seems like a good place to start experimenting.
  2. Need to specify which parts of Drupal are exposed through the GData module. We also need to consider read-only and read/write parts separately.
  3. Tweaking the Atom module to make it take in requests as GData specifies.
  4. Creating a gfeed.php which actually takes all client requests and passes them on to Drupal.
  5. Test gfeed.php and the GData module across certain test conditions.

Up to this point I expect to need a month's time, by which we may even have better solutions to the HTTP PUT / DELETE and gfeed.php issues.
  1. Support for basic query stuff through the GData module. ( remember the 'q' thing ? )
  2. Atom has a publishing protocol too, which however is not implemented in the Atom module, but is needed in GData. Thus the GData module now needs to grow into supporting updates, insertions and deletions.
  3. Implementing authentication in GData. I expect that by this time Google will have done more work on this end and implementation will be easier.
  4. Testing as usual.

Again, the block above should be a month's work.
  1. Optimise the module's query part so that it is less time and resource consuming.
  2. Allow the community to test the module and go into a full bug-fixing mode.

At the end of this we should have a full working GData module for Drupal.

Thursday, May 04, 2006

Me in a Drupal project

Me in a Drupal project :
I have been a PHP coder for almost 3 years now. I have knowledge of C / C++, and so I chose PHP 3 years back for web programming due to the similarity. I have worked previously for 123greetings.com, where I had to plan and oversee the building of an internal content management system to manage around 30,000 ecards and the associated data. The primary coding language there is Perl, though, and I was not a coder for that system. From there my interest in content management grew, and presently I make some CMS for others.

I am new to the Drupal API, but for the last few days I have tried to get a grip on it. I have discussed a lot with webchick on various matters related to understanding the API, and I am now comfortable with it. Since I have studied and/or used other APIs previously (like the Win32 API, Berkeley DB, DirectX or Irrlicht), I understand the basic concept of APIs well. Drupal looks nice and of course very *do-able*. I have been through the "hooks" part and also studied the example modules. I have also installed and looked thoroughly into the Atom module in particular, since it is related to the GData module. I seem to understand around 70% of what is going on in the Atom module. I am a fast learner and have absolutely no problem learning things from different spheres.

I have also taken time to look through the coding standards of the Drupal community. As for my own coding style, I have here the full sources of an emulator project I am doing for my college with a team of two other students: it is in C++ and uses the Irrlicht API. Also here is the source of a CMS I recently developed for www.timeline-studios.com, in PHP.

I am a student of BSc (Bachelor of Science) in Computer Science at Maharaja Manindra Chandra College, Kolkata, under Calcutta University, India.

I have previously been involved in some Open Source projects like GPLFlash, though my contributions have been small. I am present on the PC-BSD and Irrlicht (Open Source graphics lib on Win) IRC channels. I have used CVS and SVN, but only for read-only purposes.

With support from the mentors (webchick in particular has helped a lot) and the Drupal community, I am sure I will not have much of a problem implementing this module.

Wednesday, May 03, 2006

Further Clarification on Apache SCT

Further Clarification on Apache SCT (Simple Config Tool)
With regard to the discussions on the #httpd-dev channel on freenode yesterday, I am posting some more details:
  • The tool is not supposed to be a *get everything done* thingy. It will not read your mind and do all the settings for you.
  • The main aim is to make the details easier to understand and to be a helper in self-study of the Apache config system, mainly the directives.
  • The tool will not write anything to the filesystem (earlier I had thought of keeping this as an option), for security reasons.
  • The user can download a copy of the generated .htaccess and place the file on the filesystem himself/herself.

Updates :
May 4th 2:06 IST:
  • Internally the tool must store temporary data (or directives) in some temporary files, so that it can remember settings the user made previously and revert to them when needed. This will tremendously help a person to experiment with the directives and learn.
  • The tool can be made as a mod, like mod_asct (Apache Simple Config Tool). After our discussion on #httpd-dev, I intend not to add any write capabilities and to enable only the download-and-copy-paste system, so I guess users who want it can then use a mod. This will be more helpful to admins of multiple sites.
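
The remember-and-revert behaviour described in the update above can be sketched as a simple stack of settings snapshots. This is only an illustration in Python, not the tool's code: the class name SettingsHistory and its methods are hypothetical, and the snapshots are kept in memory here, whereas the post talks about temporary files.

```python
class SettingsHistory:
    """Keeps every version of the user's settings so that a change can be undone."""

    def __init__(self, initial=None):
        # The first snapshot is the starting configuration.
        self._snapshots = [dict(initial or {})]

    @property
    def current(self):
        # Return a copy so callers cannot alter the stored history by accident.
        return dict(self._snapshots[-1])

    def update(self, **changes):
        """Record a new snapshot with the given directives changed."""
        snapshot = self.current
        snapshot.update(changes)
        self._snapshots.append(snapshot)

    def revert(self):
        """Drop the latest snapshot and return the previous settings."""
        if len(self._snapshots) > 1:
            self._snapshots.pop()
        return self.current


history = SettingsHistory({"DirectoryIndex": "index.html"})
history.update(DirectoryIndex="home.php")
previous = history.revert()
```

Keeping whole snapshots rather than individual changes makes reverting trivial, at the cost of a little extra storage, which seems acceptable for a handful of directives.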

Apache Simple Config Tool

Simplifying Apache configuration :
The intention of this project is to make a tool or application which will greatly simplify the configuration process for the Apache web server once it has been installed. The tool can be compared to phpMyAdmin (which exists for MySQL). phpMyAdmin is extensive in its approach to helping users use a MySQL database server. That does not imply that users cannot access MySQL without phpMyAdmin, but it makes maintenance a lot easier. Content on the internet is not created only by those who can understand the details of .htaccess files. People from non-tech spheres create a lot of content, and they have to maintain web sites too. Learning the details of Apache configs without a useful GUI is a burden to most such people. A tool which helps in the process is very much needed.

Benefits to the Community :
Through this project I intend to create a tool (for example a PHP/Perl/Python based GUI application) which will simplify the process of configuring the Apache web server. The project can also be extended to make Apache log files easier to read and more helpful to a general non-technical audience.

Deliverables :
An application written in PHP, Perl or Python, or a combination of them, which enables easy GUI based configuration of the Apache web server. I may also implement an extension to make log files more understandable / readable and useful to the general user.

Project Details :
The GUI tool (like phpMyAdmin) will help configure and understand the settings of the Apache web server. It will have forms in easy English (maybe other languages too) which ask questions like:
What is the domain name of this site? or Where are the files located for X website?, etc. The question forms have to be well designed so that users don't get confused or overwhelmed by the number of questions. The tool will have to use the built-in facilities available in the programming language or environment (like $_SERVER['SERVER_NAME'] and similar data in PHP) to provide useful options and suggestions to the user. Config options can be generated when requested by the user so that he/she can copy and paste them into a .htaccess file.
A second module can be added which provides some important information from the Apache log files (like browsers, host IPs, etc.). This is available in other specific tools, but adding it to this config tool will make a nice module.
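
As a minimal sketch of the "generate config options on request" step, the Python function below turns form answers into .htaccess text and returns it as a string instead of writing a file. The function name render_htaccess and its parameters are hypothetical. DirectoryIndex and ErrorDocument are real Apache directives that are allowed in .htaccess files; the domain name and the site directory (ServerName, DocumentRoot) belong in the server or virtual host configuration, so they are left out here.

```python
def render_htaccess(default_pages, error_pages=None):
    """Build .htaccess text from form answers; nothing is written to disk."""
    if not default_pages:
        raise ValueError("at least one default page is required")
    lines = ["# Generated by a hypothetical config helper; review before use."]
    # Pages Apache should try, in order, when a directory is requested.
    lines.append("DirectoryIndex " + " ".join(default_pages))
    # Optional custom error pages, e.g. {404: "/notfound.html"}.
    for status, path in sorted((error_pages or {}).items()):
        lines.append(f"ErrorDocument {status} {path}")
    return "\n".join(lines) + "\n"


text = render_htaccess(["index.php", "index.html"], {404: "/notfound.html"})
print(text)
```

The user would copy the printed text into a .htaccess file by hand, which keeps the tool itself free of any write access.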

See a sample form for the Apache conf tool
Simplifying Apache configuration? (original blog post on this idea from Matt)
More comes here (details particular to SoC)

8085 Emulator

The College project : An 8085 Microprocessor Kit Emulator
The project, done in C++, tries to emulate an 8085 microprocessor kit just as a hardware trainer kit would work. The purpose is to create a software-based 8085 trainer kit which looks and works similarly. So this one does not have buttons like [ADI] or [MVI]. Instead it has HEX code buttons and buttons for Rst (Reset), Set, Ins (Insert), etc., as one would find on a hardware 8085 trainer kit.

The project isn't fully complete yet. The sources are available here :
defines.h
interpreter.h
interpreter.cpp
memory.h
memory.cpp
main.cpp

All sources available under GNU GPL
This project uses the Irrlicht library for UI : Irrlicht

Update (4th May 1:42 AM IST) : At last the code works. It compiles without errors and I have tested it with some very simple assembly code. It works! But some more work needs to be done...
An archive of all files : here
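
To give an idea of what the interpreter part of such an emulator does with the HEX codes keyed in, here is a toy fetch-decode-execute loop. It is a Python sketch for illustration only and is not taken from the project's C++ sources. The class name Mini8085 and its methods are hypothetical, only four opcodes are handled, and flags other than carry are ignored. The opcode values (3E for MVI A, C6 for ADI, 32 for STA, 76 for HLT) are the standard 8085 encodings.

```python
class Mini8085:
    """Toy fetch-decode-execute loop for a handful of 8085 opcodes."""

    def __init__(self):
        self.memory = bytearray(0x10000)  # 64 KiB address space
        self.a = 0           # accumulator
        self.pc = 0          # program counter
        self.carry = False
        self.halted = False

    def load(self, address, program):
        """Copy the program bytes into memory and point the PC at them."""
        self.memory[address:address + len(program)] = bytes(program)
        self.pc = address

    def _fetch(self):
        byte = self.memory[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        return byte

    def step(self):
        opcode = self._fetch()
        if opcode == 0x3E:      # MVI A, d8: load an immediate byte into A
            self.a = self._fetch()
        elif opcode == 0xC6:    # ADI d8: add an immediate byte to A
            total = self.a + self._fetch()
            self.carry = total > 0xFF
            self.a = total & 0xFF
        elif opcode == 0x32:    # STA a16: store A at a 16-bit address, low byte first
            low = self._fetch()
            high = self._fetch()
            self.memory[(high << 8) | low] = self.a
        elif opcode == 0x76:    # HLT: stop execution
            self.halted = True
        else:
            raise NotImplementedError(f"opcode {opcode:02X} is not handled in this sketch")

    def run(self, max_steps=1000):
        for _ in range(max_steps):
            if self.halted:
                break
            self.step()


cpu = Mini8085()
# MVI A,05h ; ADI 03h ; STA 2050h ; HLT
cpu.load(0x2000, [0x3E, 0x05, 0xC6, 0x03, 0x32, 0x50, 0x20, 0x76])
cpu.run()
```

After the run the accumulator and memory location 2050h both hold 08h, which is the kind of result one would read back on the kit's display.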

Monday, May 01, 2006

Sample Settings Form for Apache Simple Conf Tool

A Sample Settings Form:

Welcome back, Sumit
Edit settings for your website

Website domain
Tip: This is the domain name for this site. Do not use http://
I have detected that the current domain is sumit.pixlie.com

Website directory
Tip: This is the directory containing all the files (text, image, audio, video, etc.) for this domain.
I have detected that the current directory is /home/sumit/pixlie

Website default page
Tip: This is the page that you want users to see when they come to http://sumit.pixlie.com but do not specify any particular page like blog.html. This can be any valid webpage name like index.php, home.php, index.html, default.html, etc.

Apache Simple Config Tool Application for SoC 2006

Name :
Sumit Datta

Email :
sumitdatta@gmail.com

Project Title :
Simplifying Apache configuration

Synopsis :
The intention of this project is to make a tool or application which will greatly simplify the configuration process for the Apache web server once it has been installed. The tool can be compared to phpMyAdmin (which exists for MySQL). phpMyAdmin is extensive in its approach to helping users use a MySQL database server. That does not imply that users cannot access MySQL without phpMyAdmin, but it makes maintenance a lot easier. Content on the internet is not created only by those who can understand the details of .htaccess files. People from non-tech spheres create a lot of content, and they have to maintain web sites too. Learning the details of Apache configs without a useful GUI is a burden to most such people. A tool which helps in the process is very much needed.

Benefits to the Community :
Through this project I intend to create a tool (for example a PHP/Perl/Python based GUI application) which will simplify the process of configuring the Apache web server. The project can also be extended to make Apache log files easier to read and more helpful to a general non-technical audience.

Deliverables :
An application written in PHP, Perl or Python, or a combination of them, which enables easy GUI based configuration of the Apache web server. I may also implement an extension to make log files more understandable / readable and useful to the general user.

Project Details :
The GUI tool (like phpMyAdmin) will help configure and understand the settings of the Apache web server. It will have forms in easy English (maybe other languages too) which ask questions like: What is the domain name of this site? or Where are the files located for X website?, etc. The question forms have to be well designed so that users don't get confused or overwhelmed by the number of questions. The tool will have to use the built-in facilities available in the programming language or environment (like $_SERVER['SERVER_NAME'] and similar data in PHP) to provide useful options and suggestions to the user.
Config options can be generated when requested by the user so that he/she can copy and paste them into a .htaccess file.
A second module can be added which provides some important information from the Apache log files (like browsers, host IPs, etc.). This is available in other specific tools, but adding it to this config tool will make a nice module.
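
The log-file module mentioned above could start from something like the following Python sketch, which counts client hosts and browsers in access log lines. It assumes the logs use Apache's Combined Log Format; the function name summarise and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# One line of Apache's Combined Log Format:
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarise(lines):
    """Count requests per client host and per browser (User-Agent string)."""
    hosts, agents = Counter(), Counter()
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        hosts[match.group("host")] += 1
        agents[match.group("agent")] += 1
    return hosts, agents


# Made-up sample lines; a real tool would read them from the access log.
sample = [
    '127.0.0.1 - - [01/May/2006:10:00:00 +0530] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '127.0.0.1 - - [01/May/2006:10:00:05 +0530] "GET /blog.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '10.0.0.7 - - [01/May/2006:10:01:00 +0530] "GET / HTTP/1.1" 404 128 "-" "Opera/8.5"',
    'this line is not a log entry',
]
hosts, agents = summarise(sample)
```

The two counters can then be shown to a non-technical user as plain tables such as "visitors by address" and "visitors by browser".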

Project Schedule :
The project will take about a month. Then we can release a beta and wait for suggestions. For this kind of tool there will be a great many user suggestions. The next phase, of more than a month, will be spent on suggestions, improvements and bug fixes. Once the tool is stable, I can move on to creating the module for log file handling.

Bio :
I am a student of Computer Science from India and a part-time PHP/MySQL web developer. I previously worked for Intrasoft Technologies (www.123greetings.com : yes, it runs from India!). There I worked on an internal content management system which was tightly linked to the employee structure of the company. Through that I gained a lot of insight into how easy it should be to put content up on the Internet. But sadly it isn't. I have worked with artists and other creative people who create wonderful content for the net, yet don't mind leaving a few extra blank spaces in filenames. The Internet is for the masses, and I understand how important it is to have tools which help ordinary people create and manage content and sites. I am myself working on a plan for a CMS which takes into account my past experiences and the current state of the Internet. Of course, my emphasis again is on empowering the common man to be a content creator and maintainer. I have been associated with Open Source projects for about 2 years now, but by my own judgement my contribution has not been satisfactory. I want to do something really useful (like this project) for the common content creators of the world. I have a C/C++ background, but I live in Internet technologies.
Thanks

About Me

Oh hello,
2006:
I am Sumit (brainless or techfundas on some IRC channels). A student web developer from India. Interested in Computers, Music, Movies. You know, the normal college-going stuff :)
I am a PHP coder and work on websites. I love making CMSs for custom needs. Non-tech people generate a lot of content on the Internet, so it is necessary that we have easy-to-use CMSs for them. I participate in Open Source projects: I was in GPLFlash at some time, and in some PHP-based CMSs. My contributions are tiny, but I try to help.
My dream subjects would be Automation and AI... and I wish that someday I will be able to study them at some University. Working on such topics would be like wow!
My favorite topics (non-dreams =P) in Computers are Operating Systems, Internet Technologies, human computer interaction on the web, UI design, web standards.
Hurray!! Got Accepted in Google Summer of Code 2006 for Drupal. See here
2009:
Yeah, that was some time back. Now I am heading a team in Kolkata, at a small start-up. We have created hispanito.com, bejant.com, showmethegolf.com, comedy.fm and celeb9.com. We are planning some other websites too. And yes, we are currently looking for investment.
In my spare time I consult for people on developing websites and web applications, and on upgrading, maintaining and scaling them. If you need me, just reply to this post and I will get back to you. Here is a list of technologies I usually work with a lot and on which you can expect help from me:
  • Linux (Debian, Fedora, Gentoo): usual web server setup
  • Apache 2.x or nginx
  • Varnish
  • PHP (I do not do much OOP, just where really needed)
  • memcache: I have been using caching on high-volume websites for the last 2+ years
  • MySQL, general SQL stuff.
  • HTML / CSS
  • JavaScript / AJAX using the jQuery library
  • Amazon Web Services, Amazon EC2, Amazon S3
2011:
I am still alive in 2011! To be honest, I am more alive now than ever. We work on forums.com, which is a group communication application. It is a hosted application and competes with what Google Groups, Yahoo! Groups, Ning or Facebook Groups have to offer. There are in fact many other such applications out there, but we want forums.com to be the best. We cater to all kinds of groups of people, from volunteer-run non-profits and activity clubs to corporates with team strengths of 10 to 1000. 2011 is a big year for us since we are making our first public launches this year. We have high hopes and we continue to push further each day.
2012:
Wow 2011 was a year with twists! Forums failed to achieve its goals and the project closed down. I joined MobStac as a Framework Engineer. Had loads of fun working on Python (a little), AWS infrastructure and HTML5 on iPad (a lot). We launched TouchSite, the iPad (later other tablets) HTML5 app for websites. It looks great and I am surely proud of what I helped build.

But it is time to move on, and in 2012 I am back to being an entrepreneur. We have a couple of ideas in mind. My wife is with me, and so are many of my dear friends. Let's rock!