#100 – Data from our blog posts

Well, this is already the first three-digit post, which is pretty cool to think about. Today I will create some tools to extract data from WUFAR’s database, analyze it, and give some insight into the first 99 articles that gave life to this project.

Let’s go!


A little comment on the WordPress database structure

All posts are stored in a table named “prefix_posts”, where “prefix” is the table prefix of the WordPress install. Inside it, you’ll find the blog posts with their content, title, author and so on. I’ll write some quick scripts to interact with that data. I’ll write them in Python, because I love the language. First comes the connection part, for which I will use the library “mysqlclient” (installable with pip).

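from MySQLdb import _mysql  # low-level API shipped with mysqlclient

# Connection arguments: host, user, password, database name
db = _mysql.connect("localhost", "user", "pass", "tooboat")
db.query("SELECT user()")

# Fetch the result set of the last query, then read its first row
r = db.store_result()
print(r.fetch_row())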

The database returns the current user, which confirms that the connection works. We can now query the database and extract some interesting data.
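To give an idea of what reading the posts table could look like, here is a minimal sketch. It assumes the default WordPress columns (`ID`, `post_title`, `post_content`, `post_status`, `post_type`), a `wp_` table prefix, and a DB-API cursor such as the one returned by `MySQLdb.connect(...).cursor()`. The function name `fetch_published_posts` is made up for illustration.

```python
def fetch_published_posts(cursor, prefix="wp_"):
    """Return (ID, title, content) rows for every published blog post.

    `cursor` is any DB-API cursor. `prefix` is the WordPress table prefix,
    assumed to be "wp_" here; adjust it to your install.
    """
    cursor.execute(
        "SELECT ID, post_title, post_content "
        "FROM {prefix}posts "
        # Skip pages, revisions, attachments and unpublished drafts
        "WHERE post_type = 'post' AND post_status = 'publish' "
        "ORDER BY ID".format(prefix=prefix)
    )
    return cursor.fetchall()
```

The filter on `post_type` matters because WordPress keeps pages, revisions and attachments in the same table as regular posts.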


Some facts

We could, for example, see the categories with the most posts:

  1. Architecture, with 19 posts
  2. Binary Exploitation, with 13 posts
  3. Penetration Testing, with 8 posts
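A ranking like this one could be computed with a single query. The sketch below is one possible way, not necessarily how the numbers above were produced. It assumes the standard WordPress taxonomy tables (`terms`, `term_taxonomy`, `term_relationships`), a `wp_` prefix, and a DB-API cursor; `top_categories` is a hypothetical helper name.

```python
def top_categories(cursor, prefix="wp_", limit=3):
    """Return (category name, post count) pairs, most used first.

    Assumes the default WordPress schema and a "wp_" table prefix.
    """
    query = (
        "SELECT t.name, COUNT(*) AS n "
        "FROM {prefix}terms t "
        "JOIN {prefix}term_taxonomy tt ON tt.term_id = t.term_id "
        "JOIN {prefix}term_relationships tr "
        "  ON tr.term_taxonomy_id = tt.term_taxonomy_id "
        "JOIN {prefix}posts p ON p.ID = tr.object_id "
        # Keep categories only (not tags) and published posts only
        "WHERE tt.taxonomy = 'category' "
        "AND p.post_type = 'post' AND p.post_status = 'publish' "
        "GROUP BY t.term_id, t.name "
        "ORDER BY n DESC, t.name "
        "LIMIT {limit}"
    ).format(prefix=prefix, limit=int(limit))
    cursor.execute(query)
    return cursor.fetchall()
```

Calling `top_categories(cursor, limit=3)` would then give the three most used categories with their post counts.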


Here are the posts with the most drafts and edits:

  1. #16 | ph00 – Getting used to manual photography, with 54 drafts
  2. #02 – Visualizing a Rubik’s solver in 3D using the Blender Python API, with 49 drafts
  3. #31 | arc00 – Architecture: correlation between human body with space and time, with 37 drafts
  4. #24 | bin02 – End of the ret-to-libc exploitation, with 34 drafts
  5. #53 | f00 – Fiction: A travel to another world, with 28 drafts

Looks like I edit my posts a little too much, although there are still 5 articles with only one draft.
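Here is a sketch of how such a count could be obtained, assuming that a “draft” corresponds to a WordPress revision. WordPress stores revisions as rows of the posts table with `post_type = 'revision'` and `post_parent` pointing at the original post. The `wp_` prefix and the helper name `most_revised_posts` are assumptions made for the example.

```python
def most_revised_posts(cursor, prefix="wp_", limit=5):
    """Return (post title, revision count) pairs, most revised first.

    Assumption: each saved draft or edit is stored as a 'revision' row
    whose post_parent is the ID of the post it belongs to.
    """
    query = (
        "SELECT p.post_title, COUNT(r.ID) AS n "
        "FROM {prefix}posts p "
        # Self-join: match every post with its own revision rows
        "JOIN {prefix}posts r "
        "  ON r.post_parent = p.ID AND r.post_type = 'revision' "
        "WHERE p.post_type = 'post' "
        "GROUP BY p.ID, p.post_title "
        "ORDER BY n DESC, p.ID "
        "LIMIT {limit}"
    ).format(prefix=prefix, limit=int(limit))
    cursor.execute(query)
    return cursor.fetchall()
```

Because of the inner join, posts without any stored revision do not show up in the result at all.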

Here are the most used words. I’ll do the same for nouns just after:

  1. the, with 4289 usages
  2. to, with 2332 usages
  3. a, with 1986 usages
  4. and, with 1483 usages
  5. of, with 1468 usages
  6. is, with 1216 usages
  7. I, with 1076 usages
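A word count like this can be sketched with the standard library alone. The snippet below assumes the post contents are available as a list of HTML strings (for example the `post_content` column). It strips the tags with a simple regular expression, which is good enough for counting words but is not a real HTML parser. The name `count_words` is made up for the example.

```python
import re
from collections import Counter
from html import unescape

TAG_RE = re.compile(r"<[^>]+>")
# A word is a run of letters, optionally joined by apostrophes (don't, I’ll)
WORD_RE = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")


def count_words(texts):
    """Count case-insensitive word occurrences across HTML strings."""
    counts = Counter()
    for text in texts:
        # Drop the markup first, then decode entities such as &amp;
        plain = unescape(TAG_RE.sub(" ", text))
        counts.update(word.lower() for word in WORD_RE.findall(plain))
    return counts
```

`count_words(contents).most_common(7)` would then return the seven most frequent words with their counts. Note that lowercasing merges “The” with “the” and turns “I” into “i”.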

And here are the most used nouns (I left out words that could also be used as verbs or pronouns, such as “work”):

  1. space, with 109 usages
  2. chunk, with 71 usages
  3. way, with 65 usages
  4. point, with 63 usages
  5. data, with 61 usages
  6. address, with 61 usages
  7. mars, with 59 usages
  8. free, with 56 usages
  9. architecture, with 55 usages
  10. object, with 54 usages
  11. human, with 52 usages
  12. memory, with 48 usages
  13. example, with 48 usages
  14. article, with 46 usages
  15. structure, with 44 usages
  16. program, with 39 usages
  17. body, with 36 usages
  18. bins, with 35 usages
  19. things, with 35 usages
  20. perception, with 34 usages
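One simple way to get from raw word counts to a list like this is to filter them against a hand-maintained set of excluded words. This is only a sketch under that assumption; a part-of-speech tagger would be another route. Both `top_words_excluding` and the example exclusion set are hypothetical.

```python
from collections import Counter


def top_words_excluding(counts, excluded, n=20):
    """Return the n most common words that are not in `excluded`.

    `counts` maps lowercase words to their number of usages;
    `excluded` is a hand-picked collection of words to leave out.
    """
    excluded = {word.lower() for word in excluded}
    kept = Counter({w: c for w, c in counts.items() if w not in excluded})
    return kept.most_common(n)


# Usage example with made-up counts
sample = Counter({"the": 10, "space": 5, "work": 4, "chunk": 3})
print(top_words_excluding(sample, {"the", "work"}, n=2))
```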

Thanks for reading! 🙂

