I’ve been struggling with the unwieldy Wikipedia XML dump, which I want to use as a corpus for an as-yet-unnamed project I’m working on.

It’s 1.4 GB of pure XML wiki content, and a huge pain to import, since SQL dumps are no longer released directly. I had to install the database structure for MediaWiki (the software that Wikipedia runs on; in the source code it’s in maintenance/tables.sql), then run a Java program called mwdumper to create an enormous SQL file. None of that took very long; what’s taking a while now is actually importing that SQL file.
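For anyone following along, the three steps above can be sketched roughly as follows. This is a minimal sketch, not the exact commands I ran: the file names, the database name, and the `build_import_steps` helper are all made up for illustration, and the `--format=sql:1.5` option is what I understand mwdumper's documented usage to be, so check it against your copy. It only builds the command lines; it does not run them. MySQL credentials are left out, and the database is assumed to exist already.

```python
import shlex


def build_import_steps(xml_dump, sql_out, database,
                       mediawiki_dir="mediawiki",
                       mwdumper_jar="mwdumper.jar"):
    """Return the shell command lines for the import pipeline, in order.

    All paths and names are hypothetical placeholders.
    """
    q = shlex.quote  # quote arguments that contain spaces or shell metacharacters
    schema = f"{mediawiki_dir}/maintenance/tables.sql"
    return [
        # 1. Create MediaWiki's table structure in the target database.
        f"mysql {q(database)} < {q(schema)}",
        # 2. Convert the XML dump into one large SQL file with mwdumper
        #    (flag assumed from mwdumper's documented usage).
        f"java -jar {q(mwdumper_jar)} --format=sql:1.5 {q(xml_dump)} > {q(sql_out)}",
        # 3. Import the generated SQL file; this is the slow part.
        f"mysql {q(database)} < {q(sql_out)}",
    ]


if __name__ == "__main__":
    for step in build_import_steps("enwiki-pages-articles.xml", "enwiki.sql", "wikidb"):
        print(step)
```

Printing the commands first instead of executing them makes it easy to check the paths before committing to an import that may run for hours.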
Wikipedia’s XML Dump, MySQL and PHP
Posted on Aug 7th, 2006
Copyright © 2010 by Dan Zarrella, social media marketing and viral marketing consultant. All rights reserved. site map
DanZarrella.com, Social & Viral Marketing Scientist
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.








