Ubuntu 12.04 Postgres Access for C

Ubuntu 12.04 Postgres Access for C

Description

This post describes the steps to install the software needed:

  • to be able to access a PostgreSQL database table from C,
  • starting from a DigitalOcean Ubuntu 12.04 Droplet, Virgin Image.

Step 1 Create DigitalOcean Ubuntu 12.04 Droplet

  • Log into DigitalOcean (DO) Droplets
  • Click the “Create Droplet” Button at the top right.
  • Name your Droplet Hostname. Select Size. Select Region.
  • Select Image, On “Distributions Tab”/UbuntuTile_bottom_half, click & select “12.04.05 x32″
  • Click Big Green Button at bottom “Create Droplet”
  • Copy the Droplet ip address for later access.

Step 2 Install PostgreSQL on Ubuntu 12.04

I follow “How To Install and Use PostgreSQL on Ubuntu 12.04″.

root@whatever:~# # logged in as root
root@whatever:~#
root@whatever:~# apt-get update

[**DELETED LINES**]

Reading package lists... Done

root@whatever:~#
root@whatever:~# # download Postgres and its helpful accompanying dependencies
root@whatever:~# sudo apt-get install postgresql postgresql-contrib

Reading package lists... 0%

[**DELETED LINES**]

Moving configuration file /var/lib/postgresql/9.1/main/pg_ident.conf to /etc/postgresql/9.1/main...
Configuring postgresql.conf to use port 5432...
update-alternatives: using /usr/share/postgresql/9.1/man/man1/postmaster.1.gz to provide /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode.
* Starting PostgreSQL 9.1 database server
[ OK ]
Setting up postgresql (9.1+129ubuntu1) ...
Setting up postgresql-contrib-9.1 (9.1.15-0ubuntu0.12.04) ...
Setting up postgresql-contrib (9.1+129ubuntu1) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place

Step 3 Add a new [Linux] User

The "How To Install and Use PostgreSQL on Ubuntu 12.04" article says the next step is to "Create Your PostgreSQL Roles and Databases".

However, first, before that, I will add a new [non root, Linux] User, since this is a virgin image and I don't want to be messing around in root all the time.

I will however, give the new user Admin Privileges.

root@whatever:~# # Joe's first step is to add a new [Linux] user.
root@whatever:~# #     now following https://www.digitalocean.com/community/tutorials/how-to-add-delete-and-grant-sudo- privileges-to-users-on-a-debian-vps
root@whatever:~# #         assume you are currently logged in as the root user, pwd == ~
root@whatever:~#
root@whatever:~# adduser joed   # or whatever username you want

[**DELETED LINES** you get asked a lot of questions in here]

Is the information correct? [Y/n]
root@whatever:~# su - joed   # www.linfo.org/su.html says "The su (short for substitute user)..."
joed@whatever:~$ pwd
/home/joed
joed@whatever:~$ exit  # exit from joed goes back to root
logout
root@whatever:~# pwd
/root

Step 4 Grant [Linux] Users Administrative Privileges

I want the newly added Linux user, "joed", to have Administrative Privileges.

root@whatever:~#
root@whatever:~# # Grant Users Administrative Privileges
root@whatever:~# #    enable sudo privileges
root@whatever:~# #        run the   visudo   command
root@whatever:~# #            i set visudo command editor to be nano  - i hate vi
root@whatever:~# # see http://superuser.com/a/821888/236556
root@whatever:~# # using WinSCP & np++ i edited /etc/sudoers  DIRECTLY, adding the following lines under the Defaults section at the beginning
root@whatever:~# #    Defaults        editor="/usr/bin/nano"
root@whatever:~#
root@whatever:~#
root@whatever:~# # now using visudo [with nano] i add the following line
root@whatever:~# #    joed    ALL=(ALL:ALL) ALL
root@whatever:~# #    under these 2 lines
root@whatever:~# #    # User privilege specification
root@whatever:~# #    root    ALL=(ALL:ALL) ALL
root@whatever:~#
root@whatever:~# #    N.B. i used the visudo command instead of editing it directly with np++
root@whatever:~# #        because i read SOMEPLACE?? that visudo checks the syntax of the lines for correctness
root@whatever:~#
root@whatever:~#
root@whatever:~#
root@whatever:~# # now let's check to see if joed has sudo priveleges
root@whatever:~#
root@whatever:~# su joed
joed@whatever:/root$
joed@whatever:/root$
joed@whatever:/root$ # check if i have sudo priveleges
joed@whatever:/root$     # see http://superuser.com/a/553939/236556
joed@whatever:/root$
joed@whatever:/root$ sudo -v
[sudo] password for joed:
joed@whatever:/root$ sudo -l
Matching Defaults entries for joed on this host:
env_reset, secure_path=/usr/local/sbin\:/usr/local/bin\:/usr/sbin\:/usr/bin\:/sbin\:/bin, editor=/usr/bin/nano

User joed may run the following commands on this host:
(ALL : ALL) ALL
joed@whatever:/root$ exit
exit
root@whatever:~#
root@whatever:~#
root@whatever:~# # OK joed is a new Linux user  &  has sudo priveleges.
root@whatever:~#

Step 5 Make the new Linux User a PostgreSQL "superuser".

Now we "pop the stack" back into "How To Install and Use PostgreSQL on Ubuntu 12.04" at the "Create Your PostgreSQL Roles and Databases" step.

Remember, in the big picture, we are making the Linux user, joed, become a PostgreSQL superuser & accessing a PostgreSQL table from the C language.

root@whatever:~#
root@whatever:~#
root@whatever:~# # Create Your PostgreSQL Roles and Databases
root@whatever:~# #    To begin creating custom users, first switch into the [PostgreSQL] default [Linux] user [== "postgres"]
root@whatever:~# sudo su - postgres      # i BELIEVE i don't need sudo because i am doing this as root but WHY NOT FOLLOW THE DIRECTIONS  :)
postgres@whatever:/root$
postgres@whatever:/root$ cd ~
postgres@whatever:~$
postgres@whatever:~$
postgres@whatever:~$ # OK so we are in the Linux user account named postgres, added by the PostgreSQL install
postgres@whatever:~$ #     now we will do a PostgreSQL createuser command  to make the existing Linux user, joed, be a PostgreSQL superuser
postgres@whatever:~$
postgres@whatever:~$
postgres@whatever:~$ pwd
/var/lib/postgresql
postgres@whatever:~$
postgres@whatever:~$
postgres@whatever:~$ createuser --pwprompt
Enter name of role to add: joed
Enter password for new role:  <enter the pw>
Enter it again: <enter the pw>
Shall the new role be a superuser? (y/n) y
postgres@whatever:~$
postgres@whatever:~$
postgres@whatever:~$
postgres@whatever:~$ # now let's look for joed in the pg database
postgres@whatever:~$
postgres@whatever:~$ psql
psql (9.1.15)
Type "help" for help.

postgres=# \dt
No relations found.
postgres=# \l
List of databases
Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
|          |          |             |             | postgres=CTc/postgres
template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
|          |          |             |             | postgres=CTc/postgres
(3 rows)

postgres=# \du-- show users
postgres=# \du
List of roles
Role name |                   Attributes                   | Member of
-----------+------------------------------------------------+-----------
joed      | Superuser, Create role, Create DB, Replication | {}
postgres  | Superuser, Create role, Create DB, Replication | {}

postgres=# -- N.B. -- is the psql  COMMENT indicator
postgres=# -- list databases & tables for current db  http://dba.stackexchange.com/a/1288
postgres=# --     also above tells how to switch the currnent db
postgres=# -- OK logout of the psql interpreter (\q) & exit back to root Linux user & log back in as postgres_user joed Linux
postgres=# \q
postgres@whatever:~$
postgres@whatever:~$
postgres@whatever:~$ exit
exit
root@whatever:~#
root@whatever:~# # OK joed is a PostgreSQL superuser

Step 6 Create a PostgreSQL Database

I am going to create a PostgreSQL database named "omnia". I will do this as the PostgreSQL user named "postgres".

N.B We are back into "How To Install and Use PostgreSQL on Ubuntu 12.04" at the "Create Your PostgreSQL Roles and [now] Databases" step.

root@whatever:~#
root@whatever:~# # ok [from root] log back in as Linux user == postgres
root@whatever:~# su postgres
postgres@whatever:/root$ cd ~
postgres@whatever:~$ # re-follow https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-ubuntu-12-04
postgres@whatever:~$ # create your first usable postgres database
postgres@whatever:~$ #    i create a named "omnia"
postgres@whatever:~$ createdb omnia
postgres@whatever:~$
postgres@whatever:~$
postgres@whatever:~$ psql
psql (9.1.15)
Type "help" for help.

postgres=# \l
List of databases
Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
omnia     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
|          |          |             |             | postgres=CTc/postgres
template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
|          |          |             |             | postgres=CTc/postgres
(4 rows)

postgres=# -- -- http://stackoverflow.com/questions/3949876/how-to-switch-databases-in-psql
postgres=# \connect omnia
You are now connected to database "omnia" as user "postgres".
omnia=# \q
postgres@whatever:/root$ # let's try this as joed
postgres@whatever:/root$ exit
exit
root@whatever:~# su joed
joed@whatever:/root$ cd ~
joed@whatever:~$ pwd
/home/joed
joed@whatever:~$ # follow https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-ubuntu-12-04
joed@whatever:~$     # log into the correct database==omnia(using the psql -d omnia)
joed@whatever:~$ psql -d omnia
psql (9.1.15)
Type "help" for help.

omnia=# -- YAHOO we're in as joed a postgres superuser
omnia=# \l
List of databases
Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
omnia     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
|          |          |             |             | postgres=CTc/postgres
template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
|          |          |             |             | postgres=CTc/postgres
(4 rows)

omnia=# \du
List of roles
Role name |                   Attributes                   | Member of
-----------+------------------------------------------------+-----------
joed      | Superuser, Create role, Create DB, Replication | {}
postgres  | Superuser, Create role, Create DB, Replication | {}

omnia=# \q
joed@whatever:~$
joed@whatever:~$ pwd
/home/joed

Step 7 Create & Initialize a Table

I will create the table named "test.

I will initialize (load it up) from a .csv file.

I will do this as the [Linux & PostgreSQL] user named "joed".

[joed@whatever trial_test_matrix]$ # now let's make the table named "test" in the "omnia" db
[joed@whatever trial_test_matrix]$ psql -d omnia
psql (9.1.15)
Type "help" for help.

omnia=# CREATE TABLE test(
omnia(#     id INTEGER PRIMARY KEY,
omnia(#     test_ix INTEGER,
omnia(#     origin CHAR(512),
omnia(#     max_prime INTEGER,
omnia(#     primes_tail_0 INTEGER,
omnia(#     primes_tail_1 INTEGER,
omnia(#     primes_tail_2 INTEGER,
omnia(#     primes_tail_3 INTEGER,
omnia(#     primes_tail_4 INTEGER,
omnia(#     primes_tail_5 INTEGER,
omnia(#     primes_tail_6 INTEGER,
omnia(#     primes_tail_7 INTEGER
omnia(# );
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "test_pkey" for table "test"
CREATE TABLE
omnia=# \dt
List of relations
Schema | Name | Type  | Owner
--------+------+-------+-------
public | test | table | joed
(1 row)

omnia=# -- show the column names for test
omnia=# \d test

Table "public.test"
Column     |      Type      | Modifiers
---------------+----------------+-----------
id            | integer        | not null
test_ix       | integer        |
origin        | character(512) |
max_prime     | integer        |
primes_tail_0 | integer        |
primes_tail_1 | integer        |
primes_tail_2 | integer        |
primes_tail_3 | integer        |
primes_tail_4 | integer        |
primes_tail_5 | integer        |
primes_tail_6 | integer        |
primes_tail_7 | integer        |
Indexes:
"test_pkey" PRIMARY KEY, btree (id)

omnia=# \q
[joed@whatever trial_test_matrix]$ # now load up the omnia.test table from init.dbcsv []n]o] ]h]e]a]d]e]r]s]
[joed@whatever trial_test_matrix]$
[joed@whatever trial_test_matrix]$ psql -d omnia
psql (9.1.15)
Type "help" for help.

omnia=# -- now import from csv
omnia=# --    see http://stackoverflow.com/a/2987451/601770
omnia=# --    we will do something like COPY zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV;
omnia=# --    COPY zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV;test/home/joed/expr/trial_test_matrix/init.csv
omnia=# -- load up the table from init.csv file NO HEADER, 1000 records
omnia=# COPY test FROM '<path to>/init.csv' DELIMITER ',' CSV;
COPY 1000
omnia=# -- yahoo
omnia=#
omnia=# SELECT test_ix, max_prime FROM test WHERE id<20;

test_ix | max_prime
---------+-----------
0 |       840
1 |       980
2 |       900
3 |       900
4 |       830
5 |       950
6 |       990
7 |       800
8 |       960
9 |       820
10 |       910
11 |       930
12 |       920
13 |       950
14 |       960
15 |       820
16 |       830
17 |       880
18 |       830
(19 rows)

omnia=# \q

[joed@whatever trial_test_matrix]$ # OK we have created a table named test
[joed@whatever trial_test_matrix]$ #      AND loaded it up with data from a .csv file

Step 8 Install gcc (the C compiller)

Aparently, gcc does not come with the standard DO Ubuntu image. Who'd 'a thunk it? :)

I'll install it as root.

[joed@whatever SQL]$
[joed@whatever SQL]$ # The tale of "woe" begins here. :)
[joed@whatever SQL]$
[joed@whatever SQL]$ # Try to compile my C program named tse_analogous_db.c
[joed@whatever SQL]$ #    located in the   .../SQL dir
[joed@whatever SQL]$
[joed@whatever SQL]$ gcc -c -I/usr/local/pgsql/include  tse_analogous_db.c
The program 'gcc' can be found in the following packages:
* gcc
* pentium-builder
Ask your administrator to install one of them
[joed@whatever SQL]$
[joed@whatever SQL]$
[joed@whatever SQL]$ # well that does that   -- i have to install gcc
[joed@whatever SQL]$ # see http://askubuntu.com/a/271561/144883
[joed@whatever SQL]$ #     via the toolchain PPA
[joed@whatever SQL]$ #          add the PPA to your system
[joed@whatever SQL]$ # i am going to do this as root
[joed@whatever SQL]$
[joed@whatever SQL]$ exit
root@whatever:~#
root@whatever:~# # OK now let's install gcc according to  http://askubuntu.com/a/271561/144883  - REALLY
root@whatever:~# #    via the toolchain PPA
root@whatever:~# #          add the PPA to your system
root@whatever:~#
root@whatever:~# # First command listed by the link i am following:
root@whatever:~# #    http://askubuntu.com/a/271561/144883
root@whatever:~# #        apt-get install python-software-properties
root@whatever:~#
root@whatever:~# apt-get install python-software-properties

Reading package lists... 0%
[**DELETED LINES**]
After this operation, 651 kB of additional disk space will be used.
Do you want to continue [Y/n]? y
[**DELETED LINES**]
Setting up python-software-properties (0.82.7.7) ...
root@whatever:~#
root@whatever:~#
root@whatever:~# # Next command listed by the link i am following:
root@whatever:~# #    http://askubuntu.com/a/271561/144883
root@whatever:~# #        add-apt-repository ppa:ubuntu-toolchain-r/test
root@whatever:~#
root@whatever:~#
root@whatever:~# add-apt-repository ppa:ubuntu-toolchain-r/test
You are about to add the following PPA to your system:
Toolchain test builds; see https://wiki.ubuntu.com/ToolChain

More info: https://launchpad.net/~ubuntu-toolchain-r/+archive/ubuntu/test
Press [ENTER] to continue or ctrl-c to cancel adding it

gpg: keyring `/tmp/tmpNRzYg4/secring.gpg' created
[**DELETED LINES**]
gpg:               imported: 1  (RSA: 1)
OK
root@whatever:~#
root@whatever:~#
root@whatever:~#
root@whatever:~# # Next command listed by the link i am following:
root@whatever:~# #    http://askubuntu.com/a/271561/144883
root@whatever:~# #        apt-get update
root@whatever:~#
root@whatever:~# apt-get update


0% [Working]
[**DELETED LINES**]
Reading package lists... Done

root@whatever:~#
root@whatever:~#
root@whatever:~#
root@whatever:~#
root@whatever:~# # Next command listed by the link i am following:
root@whatever:~# #    http://askubuntu.com/a/271561/144883
root@whatever:~# #        apt-get install gcc-4.8
root@whatever:~#
root@whatever:~# #    The above link ALSO says
root@whatever:~# #        The latest available version of gcc, for 12.04, is 4.8.1 and is available in the toolchain PPA
root@whatever:~#
root@whatever:~# apt-get install gcc-4.8.1

Reading package lists... 0%
[**DELETED LINES**]
After this operation, 70.6 MB of additional disk space will be used.
Do you want to continue [Y/n]? y
[**DELETED LINES**]
Setting up manpages-dev (3.35-0.1ubuntu1) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
root@whatever:~#
root@whatever:~#
root@whatever:~#
root@whatever:~# # Next is the FINAL command listed by the link i am following:
root@whatever:~# #    http://askubuntu.com/a/271561/144883
root@whatever:~# #        update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 50
root@whatever:~#
root@whatever:~# update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 50
root@whatever:~#
root@whatever:~#
root@whatever:~# # OK gcc is now installed :)
root@whatever:~#

Step 9 Install libpq-dev to access PostgreSQL from C

The libpq-dev code library for accessing PostgreSQL from C needs to be installed.

I'll install it as root.

root@whatever:~#
root@whatever:~# su joed
joed@whatever:/root$ # shorten the prompt
[joed@whatever SQL]$
[joed@whatever SQL]$
[joed@whatever SQL]$
[joed@whatever SQL]$ # install libpq-dev
[joed@whatever SQL]$ # Following http://askubuntu.com/a/182062/144883
[joed@whatever SQL]$ # Do this as root
[joed@whatever SQL]$ exit
exit
[root@whatever ~]#
[root@whatever ~]# apt-get install libpq-dev

Reading package lists... 0%
[**DELETED LINES**]
After this operation, 9,328 kB of additional disk space will be used.
Do you want to continue [Y/n]? y
[**DELETED LINES**]
Setting up libpq-dev (9.1.15-0ubuntu0.12.04) ...
Setting up libssl-doc (1.0.1-4ubuntu5.27) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
[root@whatever ~]#
[root@whatever ~]#
[root@whatever ~]# # OK Looks like libpq-dev is installed.
[root@whatever ~]# # Lets see if we hav been successful
[root@whatever ~]# #    Go back to joed.
[root@whatever ~]#
[root@whatever ~]# su joed
[joed@whatever ~]$ cd [path to SQL dir ie whatever dir has your C code in it]
[joed@whatever SQL]$
[joed@whatever SQL]$
[joed@whatever SQL]$ # Via WinSCP, I changed all permissions recursively to 755 Group: joed [1000] Owner: joed [1000] in [path to]/trial_test_matrix
[joed@whatever SQL]$ # see: http://www.postgresql.org/docs/9.1/interactive/auth-methods.html#AUTH-TRUST
[joed@whatever SQL]$ # see  http://www.postgresql.org/docs/9.3/static/libpq-connect.html
[joed@whatever SQL]$ #     added joed's pw for the omnia db to the connect string in tse_analogous_db.c
[joed@whatever SQL]$
[joed@whatever SQL]$ # OK lets Compile, Link & run tse_analogous_db
[joed@whatever SQL]$
[joed@whatever SQL]$
[joed@whatever SQL]$ # Compile tse_analogous_db.c
[joed@whatever SQL]$ gcc -c  -I/usr/include/postgresql  tse_analogous_db.c
[joed@whatever SQL]$
[joed@whatever SQL]$
[joed@whatever SQL]$ # Link tse_analogous_db.o into tse_analogous_db.exe
[joed@whatever SQL]$ gcc -o tse_analogous_db.exe tse_analogous_db.o -L/usr/lib -lpq
[joed@whatever SQL]$
[joed@whatever SQL]$
[joed@whatever SQL]$ # Run tse_analogous_db.exe
[joed@whatever SQL]$ ./tse_analogous_db.exe
datname        datdba         encoding       datcollate     datctype       datistemplate  datallowconn   datconnlimit   datlastsysoid  datfrozenxid   dattablespace  datacl

template1      10             6              en_US.UTF-8    en_US.UTF-8    t              t              -1             11942          706            1663           {=c/postgres,postgres=CTc/postgres}
template0      10             6              en_US.UTF-8    en_US.UTF-8    t              f              -1             11942          706            1663           {=c/postgres,postgres=CTc/postgres}
postgres       10             6              en_US.UTF-8    en_US.UTF-8    f              t              -1             11942          706            1663
omnia          10             6              en_US.UTF-8    en_US.UTF-8    f              t              -1             11942          706            1663


joe_t1 begin
id             test_ix        origin         max_prime      primes_tail_0  primes_tail_1  primes_tail_2  primes_tail_3  primes_tail_4  primes_tail_5  primes_tail_6  primes_tail_7

17             16             Aberdeen                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        830            0              0              0              0              0              0              0              0
func: func_uses_DAL_sub0, test_index: 0


func_uses_DAL_sub0.PQntuples(res): 1


[joed@whatever SQL]$
[joed@whatever SQL]$ # yahhoo SUCCESS
[joed@whatever SQL]$

Thus Endeth the Lesson :)

Love and peace,

Joe

Analyze Python cProfile stats created with pstats.dump_stats() off line.

Background

In order to improve Code Performance, I need to find functions that are good candidates for Cython implementation.

Here’s the link to the Python Docs for Code Profiling. Pay attention to cProfile.

I asked and answered this on stack overflow here.

Preamble

Previously I have essentially done the following:

import cProfile, pstats, StringIO
pr = cProfile.Profile()
pr.enable()
# ... my code did something ...
pr.disable()
s = StringIO.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)

ps.dump_stats('stats.dmp')  # dump the stats to a file named stats.dmp

So now I have the file named ‘stats.dmp’ stored off line.

Here’s how I used pstats to analyze the ‘stats.dmp’ file for human consumption off line.

The Code

#!/usr/local/bin/python2.7
# -*- coding: UTF-8 -*-
"""analyze_dmp.py takes the file INFILEPATH [a pstats dump file] Producing OUTFILEPATH [a human readable python profile]
Usage:   analyze_dmp.py INFILEPATH  OUTFILEPATH
Example: analyze_dmp.py stats.dmp   stats.log
"""
import sys, os
import cProfile, pstats, StringIO

def analyze_dmp(myinfilepath='stats.dmp', myoutfilepath='stats.log'):
    out_stream = open(myoutfilepath, 'w')
    ps = pstats.Stats(myinfilepath, stream=out_stream)
    sortby = 'cumulative'

    ps.strip_dirs().sort_stats(sortby).print_stats(.3)  # plink around with this to get the results you need

NUM_ARGS = 2
def main():
    args = sys.argv[1:]
    if len(args) != NUM_ARGS or "-h" in args or "--help" in args:
        print __doc__
        s = raw_input('hit return to quit')
        sys.exit(2)
    analyze_dmp(myinfilepath=args[0], myoutfilepath=args[1])

if __name__ == '__main__':
    main()

I tested this with a .dmp file created on Linux & analyzed on Windows XP. It worked FINE. The above Python file is named, “analyze_dmp.py”.

An Additional Resource

I also found out how to use the Python command line to analyze stats.dmp in an interactive way here.

Here’s the command line i used on Windows XP.

ipython -m pstats stats.dmp

lxml.html basic parse methods for [url, file, string]

lxml.html examples of parsing from

  • URLs
  • Files
  • Strings

URLs

import lxml.html
htmltree = lxml.html.parse('http://joecodeswell.com')

htmltree.xpath("//title")[0].text

'''
OUTPUT:
'JoeCodeswell.com'
'''

Files

N.B. Save ‘http://joecodeswell.com&#8217; as a file named ‘JoeCodeswell.com.htm’.
Make sure to cd to the dir containing the file before running the following.

import lxml.html
htmltree = lxml.html.parse('JoeCodeswell.com.htm')

htmltree.xpath("//title")[0].text

'''
OUTPUT:
'JoeCodeswell.com'
'''

Strings

N.B. Save ‘http://joecodeswell.com&#8217; as a file named ‘JoeCodeswell.com.htm’.
Make sure to cd to the dir containing the file before running the following.

import lxml.html

f = open('JoeCodeswell.com.htm', 'r'); the_string = f.read(); f.close()
htmltree = lxml.html.fromstring(the_string)

htmltree.xpath("//title")[0].text

'''
OUTPUT:
'JoeCodeswell.com'
'''

More lxml Syntax Examples

More lxml Syntax Examples

Continued from lxml HTML Scraping Syntax Examples

Content:

  • Python Code
  • Resulting Output

Python Code

#!/usr/local/bin/python2.7
# -*- coding: UTF-8 -*-
"""lxmlScrapingExamplesMore.py takes INURL [URL to an html file] Producing OUTFILEPATH [a scrapped text file]
Usage:   lxmlScrapingExamplesMore.py INURL                                                  OUTFILEPATH
Example: lxmlScrapingExamplesMore.py http://joecodeswell.org/examples/dlwebfiles/acl_attach.htm lxmlScrapingOutput.txt
"""
import sys,os

# joe professional opinion: package structure a bit goofy!   :)
import lxml, lxml.html


def lxmlScrapingExamples(myinurl, myoutfilepath):
    myinurl = 'http://joecodeswell.org/examples/dlwebfiles/acl_attach.htm' # what gets called
    print myinurl
    print myoutfilepath

    #Example 1 redo for myinurl new value 
    print "\n\nExample 1 - basic parsing of url"
    htmltree = lxml.html.parse(myinurl)
    # print "lxml.etree.tostring(htmltree, pretty_print=True) = %s"%(lxml.etree.tostring(htmltree, pretty_print=True))

    #Example 5 - xpath tag with class=value     N.B. backslashes for newLines, etc., DISAPPEAR in WordPress Marldown
    # see http://lxml.de/xpathxslt.html
    print "\n\nExample 5 - xpath tag with class=value"
    print """htmltree.xpath("//h1[@class='title topictitle1']")[0].text = %s"""%(htmltree.xpath("//h1[@class='title topictitle1']")[0].text)
    print """htmltree.xpath("//p[@class='shortdesc']")[0].text = %s"""%(htmltree.xpath("//p[@class='shortdesc']")[0].text)     
    print """len(htmltree.xpath("//var[@class='keyword varname']")) = %s"""%(len(htmltree.xpath("//var[@class='keyword varname']")))
    print """htmltree.xpath("//var[@class='keyword varname']")[0].text = %s"""%(htmltree.xpath("//var[@class='keyword varname']")[0].text)

    #Example 6 - parent   and   ElementVariables with   //  VS  .//   
    print "\n\nExample 6 - parent and ElementVariables"
    print """syntax_div = htmltree.xpath("//h2[@class='title sectiontitle']")[0].getparent() = %s"""%(htmltree.xpath("//h2[@class='title sectiontitle']")[0].getparent())
    syntax_div = htmltree.xpath("//h2[@class='title sectiontitle']")[0].getparent()
    print """syntax_div = %s"""%(syntax_div)
    print syntax_div_2string,'\n'
    print """syntax_div = %s"""%(syntax_div)  

    print "\n\nsyntax_div.xpath     //   VS  .//  \n\n"   
    print "// uses  htmltree"
    print """    syntax_div.xpath("count(//samp)") = %s"""%(syntax_div.xpath("count(//samp)"))  
    print '            equals\n'
    print """    htmltree.xpath("count(//samp)") = %s"""%(htmltree.xpath("count(//samp)"))  
    print """    syntax_div.xpath("count(//var)") = %s"""%(syntax_div.xpath("count(//var)"))  
    print '            equals'
    print """    htmltree.xpath("count(//var)") = %s"""%(htmltree.xpath("count(//var)"))  
    print '\nVS   .// uses  syntax_div ONLY'
    print """    htmltree.xpath("count(.//samp)") = %s"""%(htmltree.xpath("count(.//samp)"))  
    print """    syntax_div.xpath("count(.//var)") = %s"""%(syntax_div.xpath("count(.//var)")) 
    print "\n"    
    print syntax_div_ipython_discovery

    #Example 7 - xpath select element by text
    print "\n\nExample 7 - xpath select element by text"
    print """description_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Description']")[0].getparent() = %s"""%(htmltree.xpath("//h2[@class='title sectiontitle' and text()='Description']")[0].getparent())
    description_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Description']")[0].getparent()
    print """description_div = %s"""%(description_div)
    print description_div_2string,'\n'
    print """description_div.xpath("./p")[0].text = %s"""%(description_div.xpath("./p")[0].text)


    #Example 8 - get all text in an element
    print "\n\nExample 8 - get all text in element\nsee http://lxml.de/lxmlhtml.html#html-element-methods"
    print """xample_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Example']")[0].getparent() = %s"""%(htmltree.xpath("//h2[@class='title sectiontitle' and text()='Example']")[0].getparent())    
    example_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Example']")[0].getparent()
    print example_div_2string,'\n'
    print "example_div.text_content() = %s"%(example_div.text_content())


    #Example 9 - zip/dict   data terms & data definitions
    print "\n\nExample 9 - zipping data terms & data definitions"
    print """options_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Options']")[0].getparent() = %s"""%(htmltree.xpath("//h2[@class='title sectiontitle' and text()='Options']")[0].getparent())    
    options_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Options']")[0].getparent()
    print options_div_2string
    terms = [t.text for t in options_div.xpath("dl/dt/samp/var")]
    defs  = [d.text for d in options_div.xpath("dl/dd")]
    term_def_dict = dict(zip(terms, defs))
    print '\nterm_def_dict'
    for k, v in term_def_dict.iteritems():
        print '    %s: %s'%(k,v)


#print lxml.etree.tostring(options_div, pretty_print=True)

options_div_2string = """<div class="section">
  <h2 class="title sectiontitle">Options</h2>
  <dl class="dl">
    <dt class="dt dlterm">
      <samp class="ph codeph">
        <var class="keyword varname">acl_name</var>
      </samp>
    </dt>
    <dd class="dd">Specifies the ACL policy that is applied to the named object.
      The ACL policy must exist, or an error is displayed. 
      <p class="p">Examples of
      the ACL names are 
        <samp class="ph codeph">default-root</samp>, 
        <samp class="ph codeph">test</samp>, 
        <samp class="ph codeph">default-management</samp>,
        and 
        <samp class="ph codeph">pubs_acl3</samp>.</p>
    </dd>
    <dt class="dt dlterm">
      <samp class="ph codeph">
        <var class="keyword varname">object_name</var>
      </samp>
    </dt>
    <dd class="dd">Specifies the object to which to apply the named ACL policy. The
    object name must exist, or an error is displayed. 
      <p class="p">Examples of object
      names are:
      </p>
      <ul class="ul">
        <li class="li">
          <samp class="ph codeph">/Management/Groups/Travel</samp>
        </li>
        <li class="li">
          <samp class="ph codeph">/WebSEAL</samp>
        </li>
        <li class="li">
          <samp class="ph codeph">/Management</samp>
        </li>
      </ul>
    </dd>
  </dl>
</div>"""



example_div_2string = """<div class="example">
  <h2 class="title sectiontitle">Example</h2>
  <div class="p">The following example attaches the ACL policy, 
    <samp class="ph codeph">pubs_acl3</samp>, 
    to the protected object, 
    <samp class="ph codeph">/Management</samp>: 
    <pre class="pre codeblock">
      <code>pdadmin sec_master> acl attach /Management pubs_acl3</code>
    </pre>
  </div>
</div>
"""    

description_div_2string = """<div class="section">
  <h2 class="title sectiontitle">Syntax</h2>
  <p class="p">
    <span class="keyword cmdname">acl attach</span>
    <samp class="ph codeph">
      <var class="keyword varname">object_name</var></samp> 
    <samp class="ph codeph">
      <var class="keyword varname">acl_name</var>
    </samp>
  </p>
 </div>"""

syntax_div_2string = """<div class="section">
  <h2 class="title sectiontitle">Syntax</h2>
  <p class="p">
    <span class="keyword cmdname">acl attach</span> 
    <samp class="ph codeph">
      <var class="keyword varname">object_name</var>
    </samp> 
    <samp class="ph codeph">
      <var class="keyword varname">acl_name</var>
    </samp>
  </p>
 </div>"""

syntax_div_ipython_discovery = """In [54]: syntax_div.
syntax_div.addnext             syntax_div.get_element_by_id   syntax_div.keys
syntax_div.addprevious         syntax_div.getchildren         syntax_div.label
syntax_div.append              syntax_div.getiterator         syntax_div.make_links_absolut
syntax_div.attrib              syntax_div.getnext             syntax_div.makeelement
syntax_div.base                syntax_div.getparent           syntax_div.nsmap
syntax_div.base_url            syntax_div.getprevious         syntax_div.prefix
syntax_div.body                syntax_div.getroottree         syntax_div.remove
syntax_div.clear               syntax_div.head                syntax_div.replace
syntax_div.cssselect           syntax_div.index               syntax_div.resolve_base_href
syntax_div.drop_tag            syntax_div.insert              syntax_div.rewrite_links
syntax_div.drop_tree           syntax_div.items               syntax_div.set
syntax_div.extend              syntax_div.iter                syntax_div.sourceline
syntax_div.find                syntax_div.iterancestors       syntax_div.tag
syntax_div.find_class          syntax_div.iterchildren        syntax_div.tail
syntax_div.find_rel_links      syntax_div.iterdescendants     syntax_div.text
syntax_div.findall             syntax_div.iterfind            syntax_div.text_content
syntax_div.findtext            syntax_div.iterlinks           syntax_div.values
syntax_div.forms               syntax_div.itersiblings        syntax_div.xpath
syntax_div.get                 syntax_div.itertext
"""

NUM_ARGS = 2
def main():
    args = sys.argv[1:]
    if len(args) != NUM_ARGS or "-h" in args or "--help" in args:
        print __doc__
        s = raw_input('hit return to quit')
        sys.exit(2)
    lxmlScrapingExamples(args[0], args[1])

if __name__ == '__main__':
    main()

Resulting Output

>lxmlScrapingExamplesMore.py http://joecodeswell.org/examples/dlwebfiles/acl_attach.htm lxmlScrapingOutput.txt
http://joecodeswell.org/examples/dlwebfiles/acl_attach.htm
lxmlScrapingOutput.txt


Example 1 - basic parsing of url


Example 5 - xpath tag with class=value
htmltree.xpath("//h1[@class='title topictitle1']")[0].text = acl attach
htmltree.xpath("//p[@class='shortdesc']")[0].text = Attaches an ACL policy to a protected object. If the protected
object already has an ACL attached, the ACL is replaced with a new
one. 
len(htmltree.xpath("//var[@class='keyword varname']")) = 4
htmltree.xpath("//var[@class='keyword varname']")[0].text = object_name


Example 6 - parent and ElementVariables
syntax_div = htmltree.xpath("//h2[@class='title sectiontitle']")[0].getparent() = <Element div at 0xb7df00>
syntax_div = <Element div at 0xb7df00>
<div class="section">
  <h2 class="title sectiontitle">Syntax</h2>
  <p class="p">
    <span class="keyword cmdname">acl attach</span> 
    <samp class="ph codeph">
      <var class="keyword varname">object_name</var>
    </samp> 
    <samp class="ph codeph">
      <var class="keyword varname">acl_name</var>
    </samp>
  </p>
 </div> 

syntax_div = <Element div at 0xb7df00>


syntax_div.xpath     //   VS  .//  


// uses  htmltree
    syntax_div.xpath("count(//samp)") = 14.0
            equals

    htmltree.xpath("count(//samp)") = 14.0
    syntax_div.xpath("count(//var)") = 4.0
            equals
    htmltree.xpath("count(//var)") = 4.0

VS   .// uses  syntax_div ONLY
    htmltree.xpath("count(.//samp)") = 14.0
    syntax_div.xpath("count(.//var)") = 2.0


In [54]: syntax_div.
syntax_div.addnext             syntax_div.get_element_by_id   syntax_div.keys
syntax_div.addprevious         syntax_div.getchildren         syntax_div.label
syntax_div.append              syntax_div.getiterator         syntax_div.make_links_absolut
syntax_div.attrib              syntax_div.getnext             syntax_div.makeelement
syntax_div.base                syntax_div.getparent           syntax_div.nsmap
syntax_div.base_url            syntax_div.getprevious         syntax_div.prefix
syntax_div.body                syntax_div.getroottree         syntax_div.remove
syntax_div.clear               syntax_div.head                syntax_div.replace
syntax_div.cssselect           syntax_div.index               syntax_div.resolve_base_href
syntax_div.drop_tag            syntax_div.insert              syntax_div.rewrite_links
syntax_div.drop_tree           syntax_div.items               syntax_div.set
syntax_div.extend              syntax_div.iter                syntax_div.sourceline
syntax_div.find                syntax_div.iterancestors       syntax_div.tag
syntax_div.find_class          syntax_div.iterchildren        syntax_div.tail
syntax_div.find_rel_links      syntax_div.iterdescendants     syntax_div.text
syntax_div.findall             syntax_div.iterfind            syntax_div.text_content
syntax_div.findtext            syntax_div.iterlinks           syntax_div.values
syntax_div.forms               syntax_div.itersiblings        syntax_div.xpath
syntax_div.get                 syntax_div.itertext



Example 7 - xpath select element by text
description_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Description']")[0].getparent() = <Element div at 0xd10e40>
description_div = <Element div at 0xd10e40>
<div class="section">
  <h2 class="title sectiontitle">Syntax</h2>
  <p class="p">
    <span class="keyword cmdname">acl attach</span>
    <samp class="ph codeph">
      <var class="keyword varname">object_name</var></samp> 
    <samp class="ph codeph">
      <var class="keyword varname">acl_name</var>
    </samp>
  </p>
 </div> 

description_div.xpath("./p")[0].text = At most, one ACL can be attached
to a given protected object. The same ACL can be attached to multiple
protected objects. Ensure that you are familiar with ACL management before you
use this function.


Example 8 - get all text in element
see http://lxml.de/lxmlhtml.html#html-element-methods
xample_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Example']")[0].getparent() = <Element div at 0xd10e70>
<div class="example">
  <h2 class="title sectiontitle">Example</h2>
  <div class="p">The following example attaches the ACL policy, 
    <samp class="ph codeph">pubs_acl3</samp>, 
    to the protected object, 
    <samp class="ph codeph">/Management</samp>: 
    <pre class="pre codeblock">
      <code>pdadmin sec_master> acl attach /Management pubs_acl3</code>
    </pre>
  </div>
</div>


example_div.text_content() = ExampleThe following example attaches the
ACL policy, pubs_acl3, to the protected object, /Management: pdadmin sec_master> acl attach /Management pubs_acl3




Example 9 - zipping data terms & data definitions
options_div = htmltree.xpath("//h2[@class='title sectiontitle' and text()='Options']")[0].getparent() = <Element div at 0xd10d80>
<div class="section">
  <h2 class="title sectiontitle">Options</h2>
  <dl class="dl">
    <dt class="dt dlterm">
      <samp class="ph codeph">
        <var class="keyword varname">acl_name</var>
      </samp>
    </dt>
    <dd class="dd">Specifies the ACL policy that is applied to the named object.
      The ACL policy must exist, or an error is displayed. 
      <p class="p">Examples of
      the ACL names are 
        <samp class="ph codeph">default-root</samp>, 
        <samp class="ph codeph">test</samp>, 
        <samp class="ph codeph">default-management</samp>,
        and 
        <samp class="ph codeph">pubs_acl3</samp>.</p>
    </dd>
    <dt class="dt dlterm">
      <samp class="ph codeph">
        <var class="keyword varname">object_name</var>
      </samp>
    </dt>
    <dd class="dd">Specifies the object to which to apply the named ACL policy. The
    object name must exist, or an error is displayed. 
      <p class="p">Examples of object
      names are:
      </p>
      <ul class="ul">
        <li class="li">
          <samp class="ph codeph">/Management/Groups/Travel</samp>
        </li>
        <li class="li">
          <samp class="ph codeph">/WebSEAL</samp>
        </li>
        <li class="li">
          <samp class="ph codeph">/Management</samp>
        </li>
      </ul>
    </dd>
  </dl>
</div>

term_def_dict
    object_name: Specifies the object to which to apply the named ACL policy. The
object name must exist, or an error is displayed. 
    acl_name: Specifies the ACL policy that is applied to the named object.
The ACL policy must exist, or an error is displayed. 

>

lxml HTML Scraping Syntax Examples

lxml Syntax Examples

Content:

  • Python Code
  • Resulting Output

Python Code

#!/usr/local/bin/python2.7
# -*- coding: UTF-8 -*-
"""lxmlScrapingExamples.py takes INURL [URL to an html file] Producing OUTFILEPATH [a scrapped text file]
Usage:   lxmlScrapingExamples.py INURL                                                  OUTFILEPATH
Example: lxmlScrapingExamples.py http://joecodeswell.org/examples/dlwebfiles/htmlExample.html lxmlScrapingOutput.txt
"""
import sys

# joe professional opinion: package structure a bit goofy!   :)
import lxml, lxml.html


def lxmlScrapingExamples(myinurl, myoutfilepath):
    print myinurl
    print myoutfilepath

    #Example 1 - basic parsing of url - slightly altered from: http://stackoverflow.com/a/14303564/601770
    print "\n\nExample 1 - basic parsing of url"
    htmltree = lxml.html.parse(myinurl)
    print "lxml.etree.tostring(htmltree, pretty_print=True) = %s"%(lxml.etree.tostring(htmltree, pretty_print=True))



    #Example 2 - syntax examples [css_selector, xpath] - slightly altered from: http://stackoverflow.com/a/603630/601770
    print "\n\nExample 2 - syntax examples [css_selector, xpath]"
    # joe comment - i don't know why htmltree DOESN'T WORK DIRECTLY in this example it generates error:
    #     more lxml package/module/class/function assymetry?
    '''
    File "C:\1d\PythonPjs\kivyPjs\IBMsecurityAPIclientsPj\IBMsecurityAPIclient\ngExamples.py", line 28, in lxmlScrapingExamples
        for a in mySearchTree.cssselect('tr a'):
    AttributeError: 'lxml.etree._ElementTree' object has no attribute 'cssselect'    
    '''
    #mySearchTree = htmltree
    mySearchTree = lxml.html.fromstring(lxml.etree.tostring(htmltree))         
    # Find all 'a' elements inside 'tr' table rows with css selector
    print "Find all 'a' elements inside 'tr' table rows with css selector"
    for itm in mySearchTree.cssselect('tr a'):
        print 'found "%s" link to href "%s"' % (itm.text, itm.get('href'))    
    # Find all 'a' elements inside 'tr' table rows with xpath
    print "Find all 'a' elements inside 'tr' table rows with xpath"
    for itm in mySearchTree.xpath('.//tr/*/a'):
        print 'found "%s" link to href "%s"' % (itm.text, itm.get('href'))

    #Example 3 - syntax examples [xpath, .findall(), .getchildren()] - slightly altered from: http://stackoverflow.com/a/9920703/601770
    print "\n\nExample 3 - syntax examples [xpath, .findall(), .getchildren()] "
    page = htmltree
    rows = page.xpath("body/table")[1].findall("tr")   # table [1] is the 2nd table in MY example html
    data = list()
    for row in rows:
        data.append([c.text for c in row.getchildren()])
    for itm in data[4:]: print(itm)

    #Example 4 - following sibling [] - slightly altered from: http://stackoverflow.com/questions/3139402/how-to-select-following-sibling-xml-tag-using-xpath
    print "\n\nExample 4 - following sibling []"
    sibEx = '''
    <html>
    <head>
    <title>following sibling</title>
    </head>
    <body>
    <table border>    
    <tr>
        <td class="name">Brand</td>
        <td class="desc">Intel</td>
    </tr>
    <tr>
        <td class="name">Series</td>
        <td class="desc">Core i5</td>
    </tr>
    <tr>
        <td class="name">Cores</td>
        <td class="desc">4</td>
    </tr>
    <tr>
        <td class="name">Socket</td>
        <td class="desc">LGA 1156</td>    
    </tr>

    <tr>
        <td class="name">Brand</td>
        <td class="desc">AMD</td>
    </tr>
    <tr>
        <td class="name">Series</td>
        <td class="desc">Phenom II X4</td>
    </tr>
    <tr>
        <td class="name">Cores</td>
        <td class="desc">4</td>
    </tr>
    <tr>
        <td class="name">Socket</td>
        <td class="desc">Socket AM3</td>
    </tr>
    </table>
    </body>
    </html>    
    '''
    parsedDocument = lxml.html.fromstring(sibEx)

    # bad
    #rlist = parsedDocument.xpath("tr[td[@class='name'] ='Brand']")
    #rlist = parsedDocument.xpath("tr[td[@class='name'] ='Brand']/td[@class='desc']")
    #r = parsedDocument.xpath(tr/td[@class="name"])=='Brand')
    # r = parsedDocument.tr[td[@class='name'] ='Brand'].text
    #r = parsedDocument.tr[td[@class='name'] ='Brand']/td[@class='desc'].text
    #if(parsedDocument.xpath(tr/td[@class="name"])=='Brand'):

    # good
    #print "parsedDocument.xpath('/html/body/table/tr') = %s"%(parsedDocument.xpath('/html/body/table/tr'))
    print """parsedDocument.xpath("//tr[td[@class='name'] ='Brand']/td[@class='desc']") = %s"""%(parsedDocument.xpath("//tr[td[@class='name'] ='Brand']/td[@class='desc']"))
    print """parsedDocument.xpath("//tr[td[@class='name'] ='Brand']/td[@class='desc']")[0].text = %s"""%(parsedDocument.xpath("//tr[td[@class='name'] ='Brand']/td[@class='desc']")[0].text)



    print '\n\n\n'


NUM_ARGS = 2
def main():
    args = sys.argv[1:]
    if len(args) != NUM_ARGS or "-h" in args or "--help" in args:
        print __doc__
        s = raw_input('hit return to quit')
        sys.exit(2)
    lxmlScrapingExamples(args[0], args[1])

if __name__ == '__main__':
    main()

Resulting Output

>lxmlScrapingExamples.py http://joecodeswell.org/examples/dlwebfiles/htmlExample.html lxmlScrapingOutput.txt
http://joecodeswell.org/examples/dlwebfiles/htmlExample.html
lxmlScrapingOutput.txt


Example 1 - basic parsing of url
lxml.etree.tostring(htmltree, pretty_print=True) = <!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="content-type" content="text/html; charset=windows-1252"/>
    <title>lxml htmlExamples.html</title>
  </head>
  <body>
    <h1>lxml htmlExamples.html for Joe Codeswell examples - dlwebfiles</h1>

    <h2>Example 1</h2>
    <ul><li><a href="http://joecodeswell.org/examples/dlwebfiles/aveverum.mid">aveverum.mid</a></li>
      <li><a href="http://joecodeswell.org/examples/dlwebfiles/carol.mid">carol.mid</a></li>
      <li><a href="http://joecodeswell.org/examples/dlwebfiles/steiner.mid">steiner.mid</a></li>
    </ul><h2>Example 2</h2>
    <table align="left" border="0" cellspacing="0" cellpadding="0" width="100%"><tr align="left" valign="top"><th>Name</th>
        <th>File Name & Link</th>
      </tr><tr align="left" valign="top"><td>Ave Verum</td><td><a href="http://joecodeswell.org/examples/dlwebfiles/aveverum.mid">aveverum.mid</a></td></tr><tr align="left" valign="top"><td>A Carol</td><td><a href="http://joecodeswell.org/examples/dlwebfiles/carol.mid.mid">carol.mid</a></td></tr><tr align="left" valign="top"><td>Steiner Amen?</td><td><a href="http://joecodeswell.org/examples/dlwebfiles/steiner.mid">steiner.mid</a></td></tr></table><h2>Example 3</h2>
    <table border=""><tr align="LEFT"><th colspan="38">Main Subject</th>
    </tr><tr align="LEFT"><th colspan="2"> </th>

    <th valign="TOP" colspan="18">part1</th>
    <th valign="TOP" colspan="18">part2</th>
    </tr><tr align="LEFT"><th colspan="2"> </th>
    <th valign="TOP" colspan="9">sub-part1</th>
    <th valign="TOP" colspan="9">sub-part2</th>
    <th valign="TOP" colspan="9">sub-part3</th>
    <th valign="TOP" colspan="9">sub-part4</th>
    </tr><tr align="LEFT"><th colspan="2"> </th>
    <th valign="TOP" colspan="1">subject1</th>
    <th valign="TOP" colspan="1">subject2</th>

    <th valign="TOP" colspan="1">subject10</th>
    <th valign="TOP" colspan="1">subject11</th>
    <th valign="TOP" colspan="1">subject12</th>
    <th valign="TOP" colspan="1">subject13</th>
    <th valign="TOP" colspan="1">subject14</th>
    <th valign="TOP" colspan="1">subject15</th>
    <th valign="TOP" colspan="1">subject16</th>

    <th valign="TOP" colspan="1">subject17</th>
    <th valign="TOP" colspan="1">subject18</th>
    <th valign="TOP" colspan="1">subject19</th>
    <th valign="TOP" colspan="1">subject20</th>
    <th valign="TOP" colspan="1">subject21</th>
    <th valign="TOP" colspan="1">subject22</th>
    <th valign="TOP" colspan="1">subject23</th>
    <th valign="TOP" colspan="1">subject24</th>
    <th valign="TOP" colspan="1">subject25</th>

    <th valign="TOP" colspan="1">subject26</th>
    <th valign="TOP" colspan="1">subject27</th>
    <th valign="TOP" colspan="1">subject28</th>
    <th valign="TOP" colspan="1">subject29</th>
    <th valign="TOP" colspan="1">subject30</th>
    <th valign="TOP" colspan="1">subject31</th>
    <th valign="TOP" colspan="1">subject32</th>
    <th valign="TOP" colspan="1">subject33</th>
    <th valign="TOP" colspan="1">subject34</th>

    <th valign="TOP" colspan="1">subject35</th>
    <th valign="TOP" colspan="1">subject36</th>
    </tr><tr align="RIGHT"><th align="LEFT" valign="TOP" rowspan="12">2050</th>
    <th align="LEFT">January</th>
    <td>0</td>
    <td>1</td>
    <td>3</td>
    <td>0</td>

    <td>4</td>
    <td>16</td>
    <td>0</td>
    <td>6</td>
    <td>2</td>
    <td>2</td>
    <td>0</td>
    <td>3</td>
    <td>0</td>

    <td>3</td>
    <td>2</td>
    <td>0</td>
    <td>26</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>7</td>
    <td>0</td>

    <td>5</td>
    <td>6</td>
    <td>0</td>
    <td>8</td>
    <td>2</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>

    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>2</td>
    <td>0</td>
    </tr><tr align="RIGHT"><th align="LEFT">February</th>
    <td>1</td>
    <td>0</td>

    <td>8</td>
    <td>0</td>
    <td>2</td>
    <td>4</td>
    <td>1</td>
    <td>6</td>
    <td>1</td>
    <td>2</td>
    <td>0</td>

    <td>3</td>
    <td>0</td>
    <td>0</td>
    <td>4</td>
    <td>0</td>
    <td>25</td>
    <td>0</td>
    <td>0</td>
    <td>1</td>

    <td>2</td>
    <td>0</td>
    <td>4</td>
    <td>14</td>
    <td>1</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>

    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    </tr><tr align="RIGHT"><th align="LEFT">March</th>

    <td>0</td>
    <td>0</td>
    <td>4</td>
    <td>0</td>
    <td>4</td>
    <td>7</td>
    <td>0</td>
    <td>9</td>
    <td>2</td>

    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>2</td>
    <td>9</td>
    <td>0</td>
    <td>45</td>
    <td>1</td>

    <td>0</td>
    <td>0</td>
    <td>7</td>
    <td>0</td>
    <td>10</td>
    <td>16</td>
    <td>0</td>
    <td>5</td>
    <td>1</td>

    <td>1</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>4</td>
    <td>0</td>

    </tr><tr align="RIGHT"><th align="LEFT">April</th>
    <td>1</td>
    <td>0</td>
    <td>5</td>
    <td>0</td>
    <td>3</td>
    <td>12</td>
    <td>1</td>

    <td>11</td>
    <td>0</td>
    <td>3</td>
    <td>0</td>
    <td>3</td>
    <td>0</td>
    <td>0</td>
    <td>3</td>
    <td>2</td>

    <td>34</td>
    <td>0</td>
    <td>0</td>
    <td>1</td>
    <td>2</td>
    <td>0</td>
    <td>6</td>
    <td>18</td>
    <td>1</td>

    <td>3</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>

    <td>5</td>
    <td>1</td>
    </tr><tr align="RIGHT"><th align="LEFT">May</th>
    <td>7</td>
    <td>0</td>
    <td>6</td>
    <td>0</td>
    <td>8</td>

    <td>4</td>
    <td>1</td>
    <td>13</td>
    <td>0</td>
    <td>0</td>
    <td>2</td>
    <td>2</td>
    <td>0</td>
    <td>1</td>

    <td>7</td>
    <td>1</td>
    <td>30</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>7</td>
    <td>0</td>
    <td>5</td>

    <td>12</td>
    <td>0</td>
    <td>4</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>

    <td>0</td>
    <td>0</td>
    <td>6</td>
    <td>1</td>
    </tr><tr align="RIGHT"><th align="LEFT">June</th>
    <td>0</td>
    <td>1</td>
    <td>14</td>

    <td>0</td>
    <td>7</td>
    <td>15</td>
    <td>0</td>
    <td>17</td>
    <td>1</td>
    <td>2</td>
    <td>0</td>
    <td>5</td>

    <td>0</td>
    <td>1</td>
    <td>3</td>
    <td>0</td>
    <td>24</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>5</td>

    <td>0</td>
    <td>6</td>
    <td>13</td>
    <td>1</td>
    <td>9</td>
    <td>1</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>

    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>2</td>
    <td>1</td>
    </tr><tr align="RIGHT"><th align="LEFT">July</th>
    <td>0</td>

    <td>1</td>
    <td>6</td>
    <td>0</td>
    <td>8</td>
    <td>17</td>
    <td>1</td>
    <td>15</td>
    <td>2</td>
    <td>1</td>

    <td>0</td>
    <td>10</td>
    <td>0</td>
    <td>2</td>
    <td>15</td>
    <td>2</td>
    <td>53</td>
    <td>0</td>
    <td>3</td>

    <td>3</td>
    <td>6</td>
    <td>0</td>
    <td>7</td>
    <td>16</td>
    <td>0</td>
    <td>9</td>
    <td>1</td>
    <td>1</td>

    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>2</td>
    <td>0</td>
    </tr><tr align="RIGHT"><th align="LEFT">August</th>
    <td>2</td>
    <td>0</td>
    <td>5</td>
    <td>0</td>
    <td>8</td>
    <td>15</td>
    <td>1</td>

    <td>17</td>
    <td>0</td>
    <td>2</td>
    <td>0</td>
    <td>2</td>
    <td>0</td>
    <td>5</td>
    <td>16</td>
    <td>0</td>

    <td>33</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>11</td>
    <td>0</td>
    <td>2</td>
    <td>25</td>
    <td>4</td>

    <td>8</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>

    <td>3</td>
    <td>0</td>
    </tr><tr align="RIGHT"><th align="LEFT">September</th>
    <td>2</td>
    <td>0</td>
    <td>10</td>
    <td>0</td>
    <td>16</td>

    <td>22</td>
    <td>2</td>
    <td>19</td>
    <td>4</td>
    <td>2</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>2</td>

    <td>8</td>
    <td>0</td>
    <td>27</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>8</td>
    <td>0</td>
    <td>11</td>

    <td>31</td>
    <td>1</td>
    <td>9</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>

    <td>0</td>
    <td>1</td>
    <td>1</td>
    <td>0</td>
    </tr><tr align="RIGHT"><th align="LEFT">October</th>
    <td>3</td>
    <td>1</td>
    <td>8</td>

    <td>0</td>
    <td>4</td>
    <td>28</td>
    <td>0</td>
    <td>15</td>
    <td>2</td>
    <td>1</td>
    <td>0</td>
    <td>1</td>

    <td>0</td>
    <td>1</td>
    <td>6</td>
    <td>0</td>
    <td>15</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>3</td>

    <td>0</td>
    <td>9</td>
    <td>26</td>
    <td>1</td>
    <td>8</td>
    <td>4</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>

    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    </tr><tr align="RIGHT"><th align="LEFT">November</th>
    <td>0</td>

    <td>3</td>
    <td>3</td>
    <td>0</td>
    <td>6</td>
    <td>23</td>
    <td>1</td>
    <td>8</td>
    <td>1</td>
    <td>2</td>

    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>3</td>
    <td>7</td>
    <td>1</td>
    <td>20</td>
    <td>0</td>
    <td>0</td>

    <td>0</td>
    <td>8</td>
    <td>0</td>
    <td>3</td>
    <td>18</td>
    <td>3</td>
    <td>7</td>
    <td>0</td>
    <td>0</td>

    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>3</td>
    <td>0</td>
    </tr><tr align="RIGHT"><th align="LEFT">December</th>
    <td>1</td>
    <td>0</td>
    <td>4</td>
    <td>0</td>
    <td>4</td>
    <td>13</td>
    <td>2</td>

    <td>15</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>2</td>
    <td>0</td>
    <td>1</td>
    <td>2</td>
    <td>0</td>

    <td>29</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>7</td>
    <td>0</td>
    <td>3</td>
    <td>20</td>
    <td>1</td>

    <td>13</td>
    <td>0</td>
    <td>1</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>

    <td>3</td>
    <td>0</td>
    </tr></table></body>
</html>



Example 2 - syntax examples [css_selector, xpath]
Find all 'a' elements inside 'tr' table rows with css selector
found "aveverum.mid" link to href "http://joecodeswell.org/examples/dlwebfiles/aveverum.mid"
found "carol.mid" link to href "http://joecodeswell.org/examples/dlwebfiles/carol.mid.mid"
found "steiner.mid" link to href "http://joecodeswell.org/examples/dlwebfiles/steiner.mid"
Find all 'a' elements inside 'tr' table rows with xpath
found "aveverum.mid" link to href "http://joecodeswell.org/examples/dlwebfiles/aveverum.mid"
found "carol.mid" link to href "http://joecodeswell.org/examples/dlwebfiles/carol.mid.mid"
found "steiner.mid" link to href "http://joecodeswell.org/examples/dlwebfiles/steiner.mid"


Example 3 - syntax examples [xpath, .findall(), .getchildren()] 
['2050', 'January', '0', '1', '3', '0', '4', '16', '0', '6', '2', '2', '0', '3', '0', '3', '2', '0', '26', '1', '0', '0', '7', '0', '5', '6', '0', '8', '2', '0', '0', '0', '0', '0', '0', '0', '2', '0']
['February', '1', '0', '8', '0', '2', '4', '1', '6', '1', '2', '0', '3', '0', '0', '4', '0', '25', '0', '0', '1', '2', '0', '4', '14', '1', '1', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0']
['March', '0', '0', '4', '0', '4', '7', '0', '9', '2', '1', '0', '0', '0', '2', '9', '0', '45', '1', '0', '0', '7', '0', '10', '16', '0', '5', '1', '1', '0', '1', '0', '0', '0', '0', '4', '0']
['April', '1', '0', '5', '0', '3', '12', '1', '11', '0', '3', '0', '3', '0', '0', '3', '2', '34', '0', '0', '1', '2', '0', '6', '18', '1', '3', '0', '0', '0', '0', '0', '0', '0', '0', '5', '1']
['May', '7', '0', '6', '0', '8', '4', '1', '13', '0', '0', '2', '2', '0', '1', '7', '1', '30', '0', '0', '0', '7', '0', '5', '12', '0', '4', '1', '0', '0', '0', '0', '0', '0', '0', '6', '1']
['June', '0', '1', '14', '0', '7', '15', '0', '17', '1', '2', '0', '5', '0', '1', '3', '0', '24', '0', '0', '0', '5', '0', '6', '13', '1', '9', '1', '1', '0', '0', '0', '0', '0', '0', '2', '1']
['July', '0', '1', '6', '0', '8', '17', '1', '15', '2', '1', '0', '10', '0', '2', '15', '2', '53', '0', '3', '3', '6', '0', '7', '16', '0', '9', '1', '1', '0', '0', '0', '0', '1', '0', '2', '0']
['August', '2', '0', '5', '0', '8', '15', '1', '17', '0', '2', '0', '2', '0', '5', '16', '0', '33', '0', '0', '0', '11', '0', '2', '25', '4', '8', '0', '0', '0', '1', '0', '0', '0', '0', '3', '0']
['September', '2', '0', '10', '0', '16', '22', '2', '19', '4', '2', '0', '0', '0', '2', '8', '0', '27', '0', '1', '0', '8', '0', '11', '31', '1', '9', '0', '0', '0', '1', '0', '0', '0', '1', '1', '0']
['October', '3', '1', '8', '0', '4', '28', '0', '15', '2', '1', '0', '1', '0', '1', '6', '0', '15', '0', '1', '0', '3', '0', '9', '26', '1', '8', '4', '0', '0', '0', '0', '0', '0', '0', '1', '0']
['November', '0', '3', '3', '0', '6', '23', '1', '8', '1', '2', '0', '1', '0', '3', '7', '1', '20', '0', '0', '0', '8', '0', '3', '18', '3', '7', '0', '0', '0', '0', '0', '0', '0', '0', '3', '0']
['December', '1', '0', '4', '0', '4', '13', '2', '15', '1', '0', '0', '2', '0', '1', '2', '0', '29', '0', '1', '0', '7', '0', '3', '20', '1', '13', '0', '1', '0', '0', '0', '0', '0', '0', '3', '0']


Example 4 - following sibling []
parsedDocument.xpath("//tr[td[@class='name'] ='Brand']/td[@class='desc']") = [<Element td at 0xda53c0>, <Element td at 0xda5390>]
parsedDocument.xpath("//tr[td[@class='name'] ='Brand']/td[@class='desc']")[0].text = Intel

>

Ways to split longer WordPress posts

Joe:

These are ways to make longer posts easier to digest for readers. They are ways to split longer posts to keep readers engaged.

Originally posted on The Daily Post:

We often think that our attention spans have grown shorter with the onslaught of digital media, but in fact longform writing — on WordPress.com and beyond — is alive and well. It’s sometimes challenging, however, to display longer pieces in a way that keeps your readers engaged.

If you’re looking for tips on presenting your latest longform creation, this post from last year, by Daily Post contributor Elizabeth, will introduce you to some nifty features built into your site. Whether you’re working on a meaty piece of prose for Blogging U.’s Writing 201 course, or just often have a lot to say, you should try these out.

Today, we’ll cover three features that can help you break up and organize longer posts, so that they display more cleanly and are easier for your readers to digest. We hope these tips come in handy!

Pagination

Longform posts are…

View original 697 more words

Easy to Understand web2py Grid Custom Search

web2py Grid Custom Search WITHOUT specifying a custom search_widget

The custom_search.html view contains the EASIER TO UNDERSTAND customization code. Here is the technique.

  1. Make the SQLFORM.grid’s Standard Search Input hidden.
  2. Define Custom Search Input elements with onchange events that send their values to the to the hidden Standard Search Input.
  3. Insert the Custom Search Input elements after the Standard Search Input (“#w2p_keywords”) using jQuery .insertAfter().
    • This prevents them from showing up on Edit or View pages.
    • Insert them in reverse order of them appearing on the page.

You can find an older version of this on web2pyslices.com

Here is the Controller code. Note the absence of a custom search_widget argument in the grid function call.

# in default.py Controller
def custom_search():
    '''
    Implements SQLFORM.grid custom search 
        WITHOUT specifying a custom search_widget,
            and so needing to read & understand the clever web2py implementation source code.
    The custom_search.html view contains the EASIER TO UNDERSTAND customization code.
    The technique:
        1. Make the grid's Standard Search Input hidden.
        2. Define Custom Search Input elements 
            with onchange events that 
                send their values to the to the hidden Standard Search Input.
    '''
    query=((db.contact.id > 0))
    fields = (db.contact.id, 
        db.contact.l_name, 
        db.contact.f_name, 
        db.contact.prime_phone,
        db.contact.date_modified,
        )

    headers = {'contact.id':   'ID',
           'contact.l_name': 'Last Name',
           'contact.f_name': 'First Name',
           'contact.prime_phone': 'Primary Phone',
           'contact.date_modified': 'Info Last Updated',
           }    
    init_sort_order=[db.contact.l_name]   

    grid = SQLFORM.grid(query=query, 
        fields=fields, 
        headers=headers, 
        orderby=init_sort_order,
        searchable=True,  
        user_signature=False, 
        create=True, deletable=False, editable=True, maxtextlength=100, paginate=25)

    return dict(grid=grid)    

Here is the View code.

<!-- In custom_search.html view -->
{{extend 'layout.html'}}
{{block head}}
{{super}}
<script>

function phoneSrch(){
    var srch ='contact.prime_phone contains '+'"'+jQuery('#joephone').val()+'"';
    $("#w2p_keywords").val(srch);
}
function lnameSrch(){
    var srch ='contact.l_name starts with '+'"'+jQuery('#joelname').val()+'"';
    $("#w2p_keywords").val(srch);
}

$(document).ready(function(){
  // Make the Grid Standard Search Input hidden  
  $("#w2p_keywords").prop("type", "hidden");   

  // Insert the Custom Search Input elements after 
  //     the Standard Search Input ("#w2p_keywords")
  //     using jQuery .insertAfter().
  //     This prevents them from showing up on Edit or View pages.
  //     Insert them in reverse order of them appearing on the page.
  var input2Str  = '<div class="joeinputclass" style="padding-bottom:10px;" >';
  input2Str += '<span class="joelabelclass" >Primary Phone contains: ';
  input2Str += '</span><input name="joephone" id="joephone" type="text" ';
  input2Str += 'onchange="phoneSrch()" style="width:150px;" ><br/></div>';
  $(input2Str).insertAfter("#w2p_keywords");
  var input1Str  = '<div class="joeinputclass" style="padding-bottom:10px;">';
  input1Str += '<span class="joelabelclass" style="padding-right:18px;" >';
  input1Str += 'Last Name starts with: </span><input name="joelname" ';
  input1Str += 'id="joelname" type="text"  onchange="lnameSrch()" ';
  input1Str += 'style="width:150px;" ></div>';
  $(input1Str).insertAfter("#w2p_keywords");
});

</script>
{{end}}
<h2>Contacts</h2>
<div id="theweb2pygrid">
{{=grid}}
</div>