MySQL: Fantastic adventures – WordPress users and their roles

This was exciting. After years of neglect, I have decided to examine, who is registering on my WordPress site, and what is that they are doing. I knew that most of them are just a result of robot activities, and they are just a dead weight in the database (useless users). However, what about the active users, who post in forums, and leave comments, and return regularly? Let me identify and cherish those.

I knew I needed a monstrous SQL statement to link several files, and to count various record types, and sort all this in alphabetic order. So the plan was to learn the WordPress (WP) table structure, to start with just user table and slowly add one table at a time be able test my progress with each and every step.

First, what are the tables in WP? What fields are stored in those tables, and how these tables linked together?

=01= Tables in WP database

 

MySQL_Shell

show tables;

wp_commentmeta
wp_comments
wp_links
wp_options
wp_postmeta
wp_posts
wp_term_relationships
wp_term_taxonomy
wp_terms
wp_usermeta
wp_users

 

We should start exploring 3 tables: wp_users, wp_usermeta and wp_posts.

=02= Fields: users and posts

 

SELECT column_name FROM 
information_schema.columns 
WHERE table_name = 'wp_users';
SELECT column_name FROM 
information_schema.columns 
WHERE table_name = 'wp_posts';
column_name
-------------------
ID
user_login
user_pass
user_nicename
user_email
user_url
user_registered
user_activation_key
user_status
display_name
column_name
---------------------
ID
post_author
post_date
post_date_gmt
post_content
post_title
post_excerpt
post_status
comment_status
ping_status
post_password
post_name
to_ping
pinged
post_modified
post_modified_gmt
post_content_filtered
post_parent
guid
menu_order
post_type
post_mime_type
comment_count
- - - - - - - - - -
SELECT column_name FROM 
information_schema.columns 
WHERE table_name = 'wp_usermeta';
column_name
-----------
umeta_id
user_id
meta_key
meta_value

 

What type of entries are stored in your Posts table?

=03= Type of posts

 

select post_type, count(*) from wp_posts group by post_type;

post_type          | count(*)
-------------------+---------
attachment         |     NNNN
forum              |       NN
nav_menu_item      |       NN
page               |       NN
post               |      NNN
reply              |     NNNN
revision           |      NNN
topic              |      NNN

 

I discovered that bbPress items identified in wp_posts table as either ‘topic’ or ‘reply’. Moving one-step forward, I am now ready to list all the users, who ever started a new topic, or replied to an existing topic.

=04= Forum posters

 

SELECT SQL_CALC_FOUND_ROWS a.ID, a.user_login, LEFT(a.user_email,18) 
AS EMAIL, LEFT(a.display_name,12) AS NAME, count(*) AS COUNT from 
wp_users a JOIN wp_posts b on a.ID = b.post_author where b.post_type 
in ('reply', 'topic') group by a.ID, a.user_login, a.user_email, 
a.display_name ORDER BY a.user_login;
SELECT FOUND_ROWS();

ID   | user_login     | EMAIL              | NAME         | COUNT
-----+----------------+--------------------+--------------+------
 NNN | AAAA           | xxx@xxxxxxx.com    |              |    NN
  NN | AAAB           | yyy@yyyyyy.com     |              |     N

 

This is already a very powerful statement, but it is time to define exactly, what is that I am looking for? I need a list of users, who either:
– changed their default password, or
– left a comment, or
– posted in forums, or
– posted an article.

That will involve linking 3 or 4 tables together. Let us start with a challenging task of identifying the most humble set of users, who just changed their default password.

What information is stored in wp_usermeta? Let’s see information for just one user:

=05= Info in wp_usermeta

 

SELECT * FROM wp_usermeta WHERE user_id = 2775;

meta_id | user_id | meta_key             | meta_value
--------+---------+----------------------+----------------------
  39171 |    2775 | first_name           |
  39172 |    2775 | last_name            |
  39173 |    2775 | nickname             | JonhnDow
  39174 |    2775 | description          |
  39175 |    2775 | rich_editing         | true
  39176 |    2775 | comment_shortcuts    | false
  39177 |    2775 | admin_color          | fresh
  39178 |    2775 | use_ssl              | 0
  39179 |    2775 | show_admin_bar_front | true
  39180 |    2775 | wp_capabilities      | a:1:{s:10:"subscriber
  39181 |    2775 | wp_user_level        | 0
  39182 |    2775 | signup_ip            | 165.231.4.122
  39183 |    2775 | default_password_nag | 1

 

The meta key value that we are looking for is default_password_nag, and are interested to identified users who do not have key, or it is not ‘1’.

=06= Password changers

This SQL statement will list all your WP users who ever changed their default password:

SELECT SQL_CALC_FOUND_ROWS ID, user_login, SUBSTR(meta_value,1,30) 
AS "First Name" FROM wp_users join wp_usermeta on id = user_id AND 
meta_key = 'first_name' WHERE ID not in (select user_id from 
wp_usermeta where meta_key = 'default_password_nag' AND meta_value 
= '1') ORDER BY user_login;
SELECT FOUND_ROWS();

ID   | user_login     | First Name
-----+----------------+------------
 NNN | xxxxxx         | Xxxxxxx
 NNN | yyyyyyy        | Yyyyy

 

As you can see, another potential idea could be to examine all users, who bothered to set their First or Last names fields. But this would not be as reliable. You could be a regular user with short password and with a bank First name field, but it is much harder to be regular user with First name, but with impossible sFkdfj435dkVoox password.

Next, let’s analyze the table storing users comments.

=07= Fields in wp_comments

 

SELECT column_name FROM information_schema.columns WHERE table_name = 
'wp_comments';

column_name
--------------------
comment_ID
comment_post_ID
comment_author
comment_author_email
comment_author_url
comment_author_IP
comment_date
comment_date_gmt
comment_content
comment_karma
comment_approved
comment_agent
comment_type
comment_parent
user_id

 

Looks like we would need to use field comment_author to link to wp_users. Now we are ready to discover all the users, who posted a comment under their login name ever.

Let us start with the list of user ID(s) stored in wp_comments.

=08= Commenting users

 

SELECT user_id, count(*) from wp_comments GROUP BY user_id;

user_id | count(*)
--------+---------
      0 |      NNN
     NN |       NN
    NNN |        N

 

One thing to remember here is that users, who never register write the most comments. For these comments user ID field will be blank.

SELECT 
a.ID, 
a.user_login, 
a.display_name, 
LEFT(MAX(d.comment_date),10) AS "Com Date", MAX(d.comment_id) AS CID, 
(select CONCAT (e.comment_post_ID, ' ', SUBSTR( TRIM( 
e.comment_content ), 1, 5)) from wp_comments e where e.comment_id = 
MAX(d.comment_id)) AS "PID+COMMENT", count(*) AS CNT
FROM 
wp_users a join wp_comments d on ID = user_id
GROUP BY 
a.ID, a.user_login, a.display_name 
ORDER BY 
a.user_login;

 

This well-formatted SQL statement (above) will return these results:

 

ID  | user_login  | display_name   | Com Date   | CID  | PID+COMMENT | CNT
----+-------------+----------------+------------+------+-------------+----
NNN | XXXXX       | Xxxx Xxxxxx    | 20YY-MM-DD | NNNN | NNNN ?????  |   N
NNN | yyyyyyyy    | Yyyyyyyy Yyy   | 20YY-MM-DD | NNNN | NNNN Thank  |  NN
NNN | zzzzz       | Zzzzz Zzzzz    | 20YY-MM-DD | NNNN | NNN Looks   |   N

Any questions?

Com Date – We only output first 10 characters of the date – LEFT(MAX(d.comment_date),10), and only the date for the latest comment made by this user – MAX(d.comment_date).

CID – Comment ID of the latest comment.

PID + Comment – Corresponding post ID (or article ID) and first 5 characters of the comment. Why only 5? To prevent wrapping around.

CNT – Totall count, how many comments the user actually posted.

Now it is time for useless users, namely, users who did not changes their default password, and did not post any comments, and did not participated in any forums:

=09= Useless users

 

SELECT 
a.ID, 
a.user_login,
a.user_email, 
a.display_name,
SUBSTR(a.user_registered, 1, 10) AS REGISTERED,
SUBSTRING_INDEX(SUBSTRING_INDEX(b.meta_value, '"', 2), '"', -1) AS ROLE
FROM 
wp_users               a 
LEFT JOIN  wp_usermeta b on a.ID = b.user_id and b.meta_key = 
'wp_capabilities' 
LEFT JOIN wp_posts     c on a.ID = c.post_author 
LEFT JOIN wp_comments  d on a.ID = d.user_id
WHERE  
a.ID IN (select f.user_id from wp_usermeta f where f.meta_key = 
'default_password_nag' AND f.meta_value = '1') AND c.post_author 
IS NULL AND d.user_id IS NULL
GROUP BY 
a.ID, 
a.user_login, 
a.display_name,
a.user_registered,
b.meta_value 
ORDER BY 
a.user_registered
INTO OUTFILE '/_BACKUP/MySQL.csv' FIELDS TERMINATED BY ',' 
ENCLOSED BY '"' LINES TERMINATED BY '\n';

 

This time we outputting too wide of a string, and we need to output it into a CSV file to be able to read in Excel. That is why you see a new option – INTO OUTFILE.

If we narrow down the output to just 3 columns – UID, login name and email, and dare registered, we will receive output similar to this:

SELECT a.ID, a.user_login, LEFT(a.user_email,20), 
SUBSTR(a.user_registered, 1, 10) AS REGISTERED
FROM wp_users a LEFT JOIN  wp_usermeta b on a.ID = b.user_id and 
b.meta_key = 'wp_capabilities' LEFT JOIN wp_posts c on a.ID = c.post_author 
LEFT JOIN wp_comments d on a.ID = d.user_id
WHERE  a.ID IN (select f.user_id from wp_usermeta f where f.meta_key = 
'default_password_nag' AND f.meta_value = '1') 
AND c.post_author IS NULL AND d.user_id IS NULL;

ID   | user_login      | LEFT(a.user_email,20) | REGISTERED
-----+-----------------+-----------------------+-----------
  NN | NabincCoinioria | llllll@rrrrrr.ru      | 20YY-MM-DD
 NNN | DougThaler      | dddd@ffffffff.com     | 20YY-MM-DD
 NNN | paul253         | sssss@mmmmmmmmmmm.net | 20YY-MM-DD
 NNN | SteveSabo       | ssssss@llllllllllllll | 20YY-MM-DD
NNNN | sebrantigan     | ssssssssss@sssssssss  | 20YY-MM-DD

=10= Delete useless users

Using user ID from the output you can easily delete all these useless users on two steps.  First, delete all related records from wp_usermeta, and then delete all related records from wp_users.  Say, you want to delete all users with ID higher than 84317.  Issue these two simple statements.

First, let’s check, how many records we are about to delete:

SELECT count(*) FROM wp_usermeta WHERE user_id >= 84317;
SELECT count(*) FROM wp_users WHERE ID >= 84317;

Next, let’s just delete all these useless users:

DELETE FROM wp_usermeta WHERE user_id >= 84317;
DELETE FROM wp_users WHERE ID >= 84317;

Hurray.  Instead of spending hours deleting 20 users at a time, we just deleted thousands of robot-created users in a matter of minutes.  I hope some will come up with a more convenient way to delete all this garbage.

 

Finally, let us combine all our knowledge from previous sections, plus some more refinements, and we are ready for this super inclusive SQL statement:

=11= Active users and their activity

 

SELECT * FROM (SELECT "01 User ID", "02 User Login", "03 User Email", 
"04 Display Name", "05 Date Registered", "06 Role", "07 Articles Count", 
"08 Last Forum Post ID + Date + Title", "09 Forum Post Count",
 "10 Last Comment ID + Date + Comment", "11 Comment Count" ) AS U1
UNION ALL 
SELECT * FROM (SELECT a.ID, a.user_login, a.user_email, a.display_name,
SUBSTR(a.user_registered, 1, 10) AS REGISTERED,
SUBSTRING_INDEX(SUBSTRING_INDEX(b.meta_value, '"', 2), '"', -1) AS ROLE,
COUNT(DISTINCT case when c.post_type = 'post' then c.ID else null end) 
AS "Articles Count", (select CONCAT(g.ID, ' - ', left(g.post_date,10),
 ' - ', g.post_title) from wp_posts g where g.ID = MAX(c.ID) and 
g.post_type in ('reply', 'topic')) AS "Forum Post ID - Date - Title",
COUNT(DISTINCT case when c.post_type in ('reply', 'topic') then c.ID 
else null end) AS "Forum Posts Count",
(select CONCAT (e.comment_post_ID, ' - ', left(e.comment_date,10), 
' - ', SUBSTR( TRIM( e.comment_content ), 1, 30)) from wp_comments e 
where e.comment_id = MAX(d.comment_id)) AS "Comment ID + Date + Comment", 
COUNT(DISTINCT d.comment_id) AS "Comment Count"
FROM (select * from wp_users order by user_login) a LEFT JOIN wp_usermeta 
b on a.ID = b.user_id and b.meta_key = 'wp_capabilities' LEFT JOIN 
wp_posts c on a.ID = c.post_author LEFT JOIN wp_comments d on a.ID = 
d.user_id
WHERE  
(a.ID NOT IN (select f.user_id from wp_usermeta f where f.meta_key = 
'default_password_nag' AND f.meta_value = '1') 
OR c.post_type is not null 
OR d.comment_id is not null 
OR SUBSTRING_INDEX(SUBSTRING_INDEX(b.meta_value, '"', 2), '"', -1) 
!= 'subscriber') AND
(c.post_type in ('reply', 'topic', 'post') or c.post_type is null)
GROUP BY 
a.ID, a.user_login, a.display_name, a.user_registered, b.meta_value 
ORDER BY a.user_login ) AS U2
INTO OUTFILE '/_BACKUP/MySQL.csv' FIELDS TERMINATED BY ',' ENCLOSED 
BY '"'  LINES TERMINATED BY '\n';

 

The result of this super statement is a CSV file with 11 columns. Open this file in Excel and examine list of your active users. Also, add all kinds of analytical data to columns 12 and beyond.

Among other things this SQL statement shows how to do following:
– Using union all and order by together
– Using conditional distinct count – COUNT(DISTINCT case when … )
– Using SUBSTRING_INDEX twice to extract substring between two characters from a longer string
– Using LEFT JOIN and comparing value to null
– Output SQL results into CSV file
– Adding column headers to CSV output (using union all)

I would love to hear, if anyone can share a way to make this SQL statement into a PHP page that can be dynamically displayed in your admin panel. This, and useless users list, of course.

Good luck!

(Visited 393 times, 1 visits today)

Be the first to comment

Your question, correction or clarification Ваш вопрос, поправка или уточнение