Tuesday, January 31, 2012

Piwik best alternative to Urchin! Web Analytics via Log files import - Piwik

Urchin development and support will be discontinued by Google as of March 2012. Urchin was a Log Analysis software bought by Google in 2005. They have used this software as a base for Google Analytics and have now announced they will focus exclusively on Google Analytics. We have since received a number of emails from Urchin users, asking if Piwik could be set up to do Log Analysis the same way Urchin did, and to import all past logs into a Piwik server.

We are happy to say that we have been developing a powerful, simple to use script that will analyse your webserver log files (Apache, Nginx, IIS, Akamai, etc.) and import visits and page views into Piwik.

We hope that in the next few months, Piwik will become the best alternative to Urchin and AWStats (and others).

Piwik Features when used to Import Log files

Piwik normally uses Javascript code to track visits and pages. This new script will also make it easy to track visits by importing one or many web server log files into Piwik. This is useful if you are not able to add the JS code to your websites, if you wish to import a large amount of historical data at once, or if you are looking for software that does the same thing as Urchin, AWStats, Webanalyzer or Webtrends.

Some features of the Piwik Log Import script include:

  • Great performance: we have successfully tested tracking several million log lines per day. See the Piwik for high traffic checklist.
  • Bot traffic is automatically excluded, to keep your Web Analytics reports clean and useful and to improve performance.
  • Piwik can track websites with the standard Javascript code, and other websites could be tracked by importing the access logs. For example Javascript tagging for website 1 and 3, and Log import for site 2 and 4. We expect these hybrid Piwik servers to become a common configuration among the community.
  • File downloads appearing in the logs will be automatically tracked as "Downloads" in Piwik
  • Because Logs will be imported via the Tracking API, all Piwik features will be supported (Goal tracking based on URL, IP Anonymization, Visitor log, etc.)
  • Some reports will have no data because log data is more limited than the data obtained via Javascript. For example, screen resolutions, supported browser plugins, Custom variables and Ecommerce Analytics will not work.
  • This script will effectively replace Apache2Piwik, the new tool providing more features and better performance.
  • In later versions we are planning to support Log reprocessing, Error code tracking, Search engine & spam Bot tracking, Feature to use the logs to enhance existing JS tracked pages, and more (based on user popularity and feedback).

This script will be written in Python and will be released under the GPL license, for Free (just like Piwik!)
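
To give a feel for the kind of processing the feature list above describes, here is a minimal sketch of how one log line could be parsed and classified as bot traffic, a download or a page view. It is not the actual Piwik script: the function name `classify`, the bot markers and the list of download extensions are all assumptions made for illustration, and it only handles the common Apache/Nginx "combined" log format.

```python
import re

# Apache/Nginx "combined" log format (one request per line).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Hypothetical lists, for illustration only; a real importer would use
# far more complete ones.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")
DOWNLOAD_EXTENSIONS = (".pdf", ".zip", ".gz", ".exe", ".dmg", ".mp3")


def classify(line):
    """Parse one log line and tag it as 'bot', 'download' or 'pageview'.

    Returns a dict of the parsed fields plus a 'kind' key, or None when
    the line does not match the combined log format.
    """
    match = COMBINED.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    user_agent = hit["user_agent"].lower()
    # Ignore the query string when looking at the file extension.
    path_only = hit["path"].split("?", 1)[0].lower()
    if any(marker in user_agent for marker in BOT_MARKERS):
        hit["kind"] = "bot"
    elif path_only.endswith(DOWNLOAD_EXTENSIONS):
        hit["kind"] = "download"
    else:
        hit["kind"] = "pageview"
    return hit
```

In a sketch like this, hits tagged "bot" would simply be skipped, while "download" and "pageview" hits would be sent on to Piwik for tracking.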

Perfect for Web hosting companies and Web agencies, but also for a one-off log import

The script will have 2 modes:

  1. Web Host – Web Analytics Provider user
    This mode is ideal for Web hosts, where new websites are often added to the access logs, but the Piwik admin does not wish to create each website manually. The script will automatically detect the Piwik website ID to track based on the URL being parsed: it will look for any Piwik website registered with a URL or "Alias URL" set to this page view's host. If no website with that hostname exists, a new website is automatically created for this URL.
    A summary is then emailed to the Piwik Super User, so that he/she knows which websites were automatically created by the Log import script and can create users or assign permissions to view these new websites.
  2. Simple Log Import for one or a few websites only
    This mode is ideal if you import only a small number of websites or if you wish to control exactly in which websites requests are tracked.
    When a line contains a URL to an unknown Piwik website, Piwik can either ignore all these pageviews, or you can choose to record these unknown pages in a specified "catchall" Website id, to double check they are not legitimate pageviews.
    If these unknown URLs turn out to be legitimate pageviews, you can either create a new website manually, or add an Alias URL to an existing website, so the page URLs are directly tracked in this website the next time you import similar logs.
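
The website lookup behind these two modes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the real script: `resolve_site_id` and its parameters are hypothetical names, the `sites` dictionary stands in for the list of Piwik websites (main URLs and Alias URLs mapped to website IDs), and creating a website is simulated by picking the next free ID instead of calling Piwik.

```python
def resolve_site_id(host, sites, auto_create=False, catchall_id=None):
    """Return the website ID a request for `host` should be tracked in.

    sites       -- dict mapping hostnames (main URL or Alias URL) to IDs
    auto_create -- "Web Host" mode: register unknown hosts as new websites
    catchall_id -- "Simple" mode: ID collecting unknown hosts, or None to
                   ignore those requests entirely
    """
    host = host.lower()
    if host in sites:
        return sites[host]
    if auto_create:
        # Stand-in for creating a website in Piwik: take the next free ID.
        new_id = max(sites.values(), default=0) + 1
        sites[host] = new_id
        return new_id
    # Unknown host in "Simple" mode: record in the catchall website, or
    # return None so the caller skips the request.
    return catchall_id


sites = {"example.com": 1, "www.example.com": 1}
print(resolve_site_id("www.example.com", sites))                   # known alias
print(resolve_site_id("shop.example.org", sites, auto_create=True))  # new website
print(resolve_site_id("unknown.net", sites, catchall_id=9))        # catchall
```

Adding an Alias URL to an existing website corresponds to adding one more hostname entry to `sites`, so the next import maps those requests directly to that website.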

Join the beta testing group

To be part of our beta testing group, please email us and mention the testing of the Urchin/AWStats log import script. Please also mention the number of websites to track, how many pages per day they receive, and whether you are willing to test the script and report bugs or feedback.

This work is sponsored by Alwaysdata, a French web hosting company. They provide Piwik as the Web Analytics package of choice, deprecating AWStats, for thousands of their users. They have been using Piwik for a few years and we are finally integrating this Log Import Analytics key feature in Piwik, and ensuring good performance for the script. We want to make it easy for Web hosts and large Web Agencies to use it as their Web Analytics platform.

Goodbye Urchin + Scale of Google Analytics in 2012

The Google Analytics team have decided to focus on the privately hosted Google Analytics (GA) service and discontinue the Log Analysis version (Urchin). At Piwik we are quite simply amazed at the scale and reach of Google Analytics in 2012: GA is used by over 55% of all Internet websites (source). At least 15 million websites use Google Analytics! (source). In comparison Piwik is used on 1% of the Internet (cheers!) and 250k+ websites.

Millions of pings (page views) are tracked by GA per SECOND. It is hard to imagine such scale, which would make any software developer speechless. We can only congratulate Google engineers and product designers for the work they are doing to track and aggregate so much data, while allowing users to slice this data in real time across dozens of dimensions. This is an amazing technical milestone. We also hope that Google users' privacy will be respected and privacy standards will improve in the future.

Regarding the end of Urchin, we at Piwik will do our best to give existing Urchin users a good experience when they move to Piwik and try the leading Free software platform. If you are an Urchin user and would like to try Piwik, send us an email describing your current setup. We will help and check that similar functionality is doable with Piwik and the log import script.

Privacy & Security implications of self hosting your Web Analytics data

Keeping full control over your customers' log files and your Piwik database is an important requirement if you are a Web agency or a Web host providing web analytics to hundreds or thousands of users.

The tips in the Privacy page will help ensure that you make the changes to data collection and data retention required by your Privacy Policy. We also focus on code security and recommend that all Piwik users spend some time securing their Piwik server.

Piwik is also an alternative to AWStats and Webalizer: modern UI, better performance, and more!

We hope that Piwik will become the leading alternative to Urchin and to AWStats. AWStats was a great tool but we hope to modernize the Log analysis open source software world and make use of all the great Piwik features and capabilities in terms of data analysis and graphing. Users in 2012 and beyond will need a modern interface to access the data gathered from their Web server Access logs.
