Try the codes or catch the exceptions

Saturday, April 11, 2015

Retrieving Google account owner and contacts via OAuth2

All the inviter solutions for web apps that I have found on internet only provide the ability to import contacts. But how does the person who is receiving the invitation will know the sender's identity? Of course, if the contact import solution is an integrated module of an application A, then you can use the user info within that application A to identify the sender. However, this requires the user to get an account within that application A, which reduces the range of people who can send non-anonymous invitations to use the app A. So I thought about a modification to fix that concern by allowing the public to send such invites. This is done by retrieving the full name and email of the Gmail/Google user whose contacts are imported via the OAuth2 protocol and thus suppressing the need to create an account within app A to send a non-incognito invites to use app A. The code is a modified version of the 25Labs' sample source.

But first, here is an illustration to quickly show how OAuth2 works.

Knowing the format of the Google API response, I simply store the response in a temporary XML file that I later parse to extract the name and email of the user who is sending the invite.
To reduce the probability of file conflict in the storage of the temporary XML file in case of multiple simultaneous imports, I add a unique prefix that I call "salt" and which I define as the MD5 hash of a unique string ; that string is the concatenation of the current Unix timestamp and IP of the user.

A slight modification for a big change. You can get the full code here. I expect to do the same for the other email services that provide an API.
Hope this helps some developers and engineers out there.

References:
- 25Labs.com
- Developers. google.com

Friday, January 16, 2015

Beware of security questions

Almost all the major companies (insurance, health, banks, hospitals, ...) which hold prime confidential information about us might allow people to steal that information instead of protecting it.
I guess most of you are familiar with their security questions. So here is a small story. Yesterday I was registering for an online service and then came to the stage where I had to provide answers to the security questions. Looking at the questions, I was like, "man, how come you feed me with the same questions set as my bank". As an information security practitioner, I fell the need of second-guessing this security feature and thus, launched a quick investigation.
To do this, I simulated accounts, each in 2 different types of companies X and Y. I won't name them to avoid any reputation impact. Here are the security questions they provided:

With a simple visual observation, one can see that X's questions are about 30% similar to Ys.Therefore, if an attacker knows the answers to access your data in company X, then the probability that s/he can access your data in company Y is 0.3. That means if the questions you choose for X and Y are in the colored similarity zone of the picture above, then you are in a big mess.
We are witnessing here a trade-off between "user responsibility" and "security hardness". Unfortunately, this is a compromise that will most likely show a shift in favor of the user responsibility because most companies don't spend much resources checking for all others companies' security questions before setting up their own; they usually take the most common security questions and tweak them a bit. And we all know that when it comes to computer systems, users like straight and easy requests. So, folks, think twice when choosing your security questions. If picking similar questions for all your accounts is a risk that you are ready to live with, then go for it, otherwise diversity in questions is highly recommended.
Another danger that came to my mind was the answer to the question. One big problem in security is adding more security layers while keeping a good user experience. In the design of the security questions, it looks like companies opt for the user experience by asking simple questions. And of course, the price to pay is low security; meaning that anyone can guess the answer if they know you well enough or use social engineering. And because most of us, users, trust computer systems (especially big fishes), we usually give the right answer to the security question instead of fooling the inference power of a possible attacker by providing a fake answer. Lying to a human being is a bad thing, but lying to a computer system is not. A word to the wise is enough.

Wednesday, May 21, 2014

Debug better, faster and make your code stronger

Anyone can code. But when it comes to building a dynamic web or software system, especially from scratch, a tremendous amount of time can be required depending on the chosen approach, the available resources, the design complexity it is given and mainly, your troubleshooting dexterity. Whether the page of code to fix is a stand-alone or depends on another page (one-to-one dependency), or on many other pages (one-to-many dependency) or vice-versa (many-to-one dependency), you definitely want to quickly find the line that is giving you headaches, makes you live in a major depression for days or prevents you from sleeping at night (Developers will recognize themselves in some of these scenarios). Here are 4 must-know reliable debugging medications that will refuel your faith in coding.

- Log files: Undoubtedly, the best clear-cut process to get informative data about a bug. When setup properly, they can help you record operations, system processes and errors that occur during the execution of a code. You can use the built-in log of the used IDE system or web server. For instance, in Apache, it is possible to configure the logs by editing your site's config file (In Linux, it is generally /etc/apache2/sites-available/mysite)
Furthermore, it is possible to write your own log method if you are not satisfied with the built-in one. An efficient way to do it is by asking yourself 3 questions: what message do I want to record (the content), what type of message to record (error, info, debug,..) and the context in which the message is recorded (the time, the location, the referrer,...). A full example of a log entry would look like this:
Error :: user [Joe Smith] data [<junk entry123>] didn't match rule 107 :: 5/8/2014 @ 2:46pm

- Breakpoints: If you have never heard of this one, you better start digging into it. These are temporary halters that are used to stop the program in specific locations so that the status of the program can be examined in stages. It is really useful in isolating the line that is running, and even the called variable, when the code goes south . In C#, the code can be broken via the System.Diagnostics.Debugger.Break() method; stop in VB, vardump()/die() in PHP. Their location in your code is up to you. However, it's a good practice to make the interpreter/compiler hit the breakpoint ONLY when you are debugging; otherwise, in production, it will continue to hit it and may abruptly reveal confidential data to users or break the program run.

- Stack traces extraction: Sometimes, you have situations where functions have to nestedly collaborate in order to have a well-implemented code i.e function a() calls function b() which calls function c() and so on and so forth. Such coding style is pretty much prevailing in complex systems and tracking a bug in this kind of vicious circle becomes a nightmare when you have hundreds of functions to scrutinize. To find the x-spot, it is more appropriate and time saving to have a broad view on the code execution; and the memory is the best place to have it. By enabling a run report on the memory stack frames, you will be able to detect that specific function where the error occurs, which considerably reduces your investigation domain. Some open-source extensions/plugins are available to do this memory forensics: Xdebug which is widely used by PHP developers and even has a plugin for Firefox. In C/C++, we have the powerful GDB that is also breakpoints-capable.

- Hard coding: This approach is mainly meant to variables definition issues and consists of implanting the input data directly in the source code. For instance, suppose you are working on integrating an API authentication system in your code and after a visual review of your code, everything looks fine : the API call, the syntax and the logic. Furthermore let's say the API call is used as follow:
function useAPI ($username, $password){
$message = loginAPI ($username, $password);
if($message == "good") {makeCall(); return "Ok"; } else return ""Bad password/login";
}

Now, when you run the code, things get unpropitious and a response like "Bad password/login" is returned to you despite the correctness of the passed credentials. Given the message, the blame is definitely not on the makeCall() function; thus, the first wise move would be to investigate on loginAPI() by hard coding the parameters sent to it. If the subsequent result is still the same, then the probability that the API URL used in loginAPI() is wrong is pretty high. If the result was different ("good") after the hard coding, then the bug would simply reside in the mistyping of the credentials initially sent to useAPI() during the earlier run.

As you may have already noticed, the techniques shown earlier are more of a dynamic analysis approach (i.e to find the bug you have to execute the code/program). If you want to use static analysis to find the bug, it might be a bit more onerous to achieve depending on the model, data flow and logic of the platform to fix.

Life school vs. School life

So what if the reason our kids are doing their schoolwork is not just for a grade, or to impress their mates or to make their relatives proud? When I was a kid, my dad always liked to repeat: "If you want to be a successful man in the future, you must be able to take advantage from the life of school and the school of life". Obviously, at that juvenile age, my lack of wisdom and the comfort zone of having family around prevented me from understanding the deepness of such statement.

As I grew up, facing the world and its daily challenges helped me understand it. Someone's work is more valuable when it impacts the person's himself and the community in which it lives and when that work is built on its own or with others. I am not talking about building the next app that will change the way we live or see things; it will not be cognitive to think that technology can solve everything. I am talking about those authentic, traditional, creative and magical acts that students can produce and learn from. Some instances include: Getting involved in clubs and associations, doing community service or volunteering, traveling the world, learning from their teachers' experience, taking part to school activities (competitions/games) or panel discussions.

Despite the effort of school in formalizing the understanding of life's aspects through numerous well-concise degrees, some skills are best learned when life is your teacher: emotional intelligence, open-mindedness, pressure and time management, collaboration, wisdom in choice making, relationship building, facts like: there is no shame in saying "i don't know", or you can't please everyone or failures are stepping stones to success when you learn from your mistakes.

What if, instead of having the evolution of students at school being evaluated via a bunch of multiples entries on a question sheet, we have their growth being evaluated on their knack in designing and sharing information for global communities to meet a variety of purposes using the theoretical concepts learnt in class.

What if school was more than defining students capabilities through corruptible grades, but through how they can impact the world or themselves, both in a positive way.

I don't know about you, but for me, I’d rather know that my kids are going to school learning something meaningful, valuable that will empower their identity or can create opportunities for them to live in a better world.

"Real work for real audience" - one said.

Sunday, December 29, 2013

Going safe, Secure forwarding

Requirements: MySQL back-end database, Apache and PHP 5.4+ installed and running

Inspired from the traffic control capability of Websense and AVG web security, my new coding partner, Pradeep, and I tried to provide you with a script doing an enhanced version of those tools' job. Indeed, the PHP script provides high quality security to outbound traffic, by warning the user of the reputation of the outside website and the potential dangers (if any) s/he is exposed to by visiting this last.
It has been designed to protect a local user of a web application from having confidential data (cookies, credit card secret code,...) being stolen via known hacking techniques like XSS.

1) How it works: The algorithm

Consider the following scenario:

Suppose all the unknown links on the page P1 are on the form :
<a href="p2.php?link=urlencode(xyz)">xyz</a>

So here is the scenario.
a) User is actually browsing on P1. S/he sees a link xyz and wants to visit that link without knowing the risk(s) s/he's exposing himself/herself to
b) When s/he clicks on it, s/he is sent to P2 to check that link first before being forwarded [Step 1 on the figure]
c) P2 open a session (you can make it secure with ssl if you want) with the BAP and check the purity/cleanness of xyz [Step2 on the figure].The BAP can be anything: a database, an encrypted file or a customized server. It's up to you
Basically, The BAP stores a list of links that have bad reputations over internet due to their content (malware, botnets, pornography,..).
d) Once the search is completed in the BAP, the BAP returns a token to P2 whose value determines if that link xyz is clean or not.Then BAP also closes the session [Step 3 on the figure]
e) If the link has been classified as "dirty" with respect to the returned token, then P2 alerts the user on P1 that s/he's heading to a bad website xyz
[Step 4 on the figure]
{
If the user acknowledges the risk and still wants to be forwarded, P2 forwards the user at his/her own risks and records the incidence in a log file [Step 5 on the figure]
else P2 returns to P1
}
otherwise the if the link has been classified as "clean" with respect to the returned token, then P2 simply forwards the user to xyz
[Step 5 on the figure]

2) Want to test it?

Download source

I know I was supposed to write about network design and configuration as promised in the last week's sneak peek. Some last minute unforeseen issues came through but don't worry, I will post about it asap in 2014. Meantime enjoy the holidays and Happy New Year from the #CodijuanaTeam!

Sunday, November 17, 2013

Network X Web Servers: What if you could setup everything from scratch?

You're a dummy programmer who's always been dealing with pre-configured hosting environment like CPanel et al. But this time, you want to take it to an upper level of control on the hosting, your own servers, your own configuration, your own security. Well, the following sequence of articles is for you. I will cut the chase later

but for now, you have to understand the network matrix which can be summarized in the following topics: how internet works, OSI model, TCP/IP, IP addressing/subnetting, Routers, Firewalls and Quality of Service.

1) Digesting the network matrix in a few lines

In order to easily get what follows, you have to keep in mind that network communication is pretty much similar to human communication, but in more conventional and adaptive manner. In any friendly talk (network), when you (device 1) talk(sending request) to another person (device 2) using your voice (data), you expect the person to reply (response via data). If that person doesn't want to hear what you're saying, he just has to clog his ears (firewall). Now, in bigger and complex conversations (internet) such as debates, conferences,.. there is a need of a moderator (router) to arbiter the communication between groups of people.

Now, let's go deeper by explaining the reverse: How machines communicate in a network. But before, I have to introduce you to a model called the Open Systems Interconnection (OSI) model that was built by the International Organization of Standardization (aka ISO) in 1984 to facilitate the understanding of this type of communication. You might ask yourself how? Well, let me fly over a sweet example that will enlighten you.

The OSI model simply formalizes the communication between machines over a network by splitting the communication flow into layers. The following conversation is a client-to-server PHP webpage request - http://www.example.com/index.php

Client		Server (example.com)
Application layer (L7): The client requests a web page by typing in a web address (a URL) in the application (browser)		Application layer (L7): The web server application (PHP from the LAMP* stack) processes the data and produce an (HTTP) response(4)
Presentation layer (L6): The request is formatted into HTTP generic request message (3)		Presentation layer (L6): The request is formatted into HTTP generic response message (3)
Session layer (L5): The browser initiates an (HTTP) session by opening a TCP connection to the (HTTP) server with which it wishes to communicate. Your web browser automatically opens additional TCP connections to the server and request those media	-------------------------------- >>> <<< --------------------------------	Session layer (L5): The server sends back the response / web page and closes the connection. \| \| Your web browser then parses the HTML of the web page to read the instructions (HTML tags) which tell the browser where to find additional files to be displayed within the web page such as style sheets, images,....
Transport layer (L4): The opened TCP connection(s) breaks up the request message(s) into managable chunks, labels them with numbers so they can be reassembled in the correct order and transport the pieces across the correct session	<<< ----------------------------- >>>	Transport layer (L4): reorder the data stream if the incoming packets are out of order, multiplexing data in case of different flows to applications on the same hosts
Network layer (L3): Internet Protocol (IP) provides unique addresses for the web server and for your computer, then creates a message addressed to the web server and to be sent via your default gateway	<<< ----------------------------- >>>	Network layer (L3): read each received packet to check if any matches any access restriction or filtering feature set at the computer firewall
Data layer (L2): Your computer uses ARP to figure out the physical MAC address of the default gateway and then passes the messages to the network card. Once there, each chunked message is transformed into a network request/packet then forwarded to the Internet via the detected gateway		Data layer (L2): The network card reads the received packets and converts them into packets readable by your machine
Physical layer (L1): transmits the packets to the default gateway.	--------------------------------------------	Physical layer (L1): receives the transmitted packets and send to the network card

*LAMP: Linux-Apache-MySQL-PHP (5)

Such modeled separation can be applied to any type of computers' communication to understand or study this last; the only difficult part is to put the right element or process of the communication in the right layer.

However, the example above is an easy one and is only valid for a small number of machines. If you want a huge multiple computers' conversation like internet, things will get jammed and your network will look like a loopy bin.

Figure 1: Traditional devices communication

That's why, you will need a moderator that will work at the layer 3 (TCP/IP) to deal with the routing of information using the computers' IP. Such moderator is called a router. And what it does, is that it divides the multiple computers' conversation into sub-conversations. After incorporating routers, your network will look like this:

Figure 2: Modern devices communication

This schema (Figure 2), when projected at a global scale, is basically called internet.

Each router has a routing table to compute the next hop for a packet. The routing table stores the routes (and in some cases, metrics associated with those routes) to particular network destinations. Your routing table is created automatically, based on the current TCP/IP configuration of your Linux / UNIX computer. You can manually add / modify / edit routing table using route and ip command on Linux (6). In addition, some routers also provide security via NAT/DNAT, additional firewalls (to not be confused with the ones of your computer) and Quality of Service as we'll see in our future articles while setting up our own environment.

Now you know that if you don't have valid destination IP address to match in the routing table, you might have trouble in your conversation (this is not new since it's the same scenario in human conversation :) ).

But:

- If the communication is only from IP to IP, how does my machine communicate via an URL in the browser?

-How does it work inside each sub-network monitored by a router?

-How about the interfacing between your computer and the router itself?

Ok ok. One question at a time.

The communication over internet is indeed IP-to-IP. Your machine has no clue of what an URL is. At L7 (see the above OSI-modeled communication example), when the user provides the URL, your computer contacts your DNS servers to convert the supplied URL into an IP address, which is later interpreted by your device as a sequence of octets through a binary division. The following figure illustrated this translation process:

Browser

http://www.example.com

DNS

125

228

Device

01001010

01111101

11100001

00010001

Figure 3: Interpretation sequence of a URL by a device within a network

For Linux-users, you can find the IP addresses of your DNS servers in the file "/etc/resolv.conf". Those DNS servers' IPs are automatically configured by the router of the sub-network to which you belong. to

Now talking about sub-networks, two things to keep in mind are: private IP and public IP. At the early beginning, everybody used public IPs to communicate. After the arrival of routers, for security reasons and to prevent an eventual shortage of public IPs, private IPs had been integrated for internal communication within sub-networks. And then, when a request, sent from your machine at L7, reached L3, your router would convert your private IP into public IP to be attached to each packet header. However, no matter the number of private IPs (hosts) in your sub-network, the public IP is the same.

The private IP is automatically assigned to the network card's interface of your device by the router of the sub-network to which you are connected and must follow the standards set by RFC 1918. Here is the range assigned private IPs depending on the class of your network.

IP address range	Number of addresses	Class of the network
10.0.0.0 - 10.255.255.255	16,777,216	class A
172.16.0.0 - 172.31.255.255	1,048,576	class B
192.168.0.0 - 192.168.255.255	65,536	class C

And yes, two machines from two (2) different sub-networks can have the same private IP; collision of private IPs is allowed as long as the machines belong to different networks.

Moreover, the structure of the private IP address reveals 2(two) major hidden lines: an identifier of your sub-network and an identifier for your machine. These lasts can be easily unmasked by reading another sequence called the subnet mask. For example, if the private IP of my wireless card's interface(wlan0) is 192.168.1.1 and my subnet mask is 255.255.255.0. We simply use an alignment reading technique to figure it out.

We stick both sequences and divide from the 0-side of the subnet mask

192.168.1.
255.255.255.

1
0

The red container is the network identifier and the green one is the identifier of the device on the network. So now, you can check yours by typing in the command line, ifconfig for Linux and ipconfig for Windows.

Another use of the subnet mask is in the design of network. For instance, let's say, after buying your router, you connect your laptop, and type ifconfig and notice your private IP is 192.168.1.0. Then, you speak to yourself and say: " I want to have a network/sub-network with 254 hosts". Using that amount, we can determine the subnet mask for your future network and thus the splitting above.

Remember (from Table 1) that an IP address is nothing more than a sequence of 4 bytes, that is 4 sequences of 8 bits.

_ _ _ _ _ _ _ _

There's a formula to use to determine depending on the starting point: the willing number D of devices OR the willing number N of subnets/networks.

D = 2ⁿ- n, N = 2ⁿ

where n is the number of bits to skip before splitting

For D, we count from the right and for N, we count from left

So for the network we want to build, it will be: 2ⁿ- n = 254 => n = 8

Hence our subnet will be:

11111111	11111111	11111111	_ _ _ _ _ _ _ _
255	255	255	0

Knowing that our subnet mask will be 255.255.255.0, we can now use this information with the targeted amount of hosts to allocate private IP addresses and their corresponding lease time to the network's devices via the DHCP This process is the manual allocation of IP addresses. Despite its effectiveness, this method is only suitable for company-size network (for example, when allocating an IP address to a company's FTP server). For minor cases like home networks, the IP allocation is usually done automatically by the router.

And because each sub-network holds a private conversation and the router is the one in charge converting your insider(private)IP into a public one X , the inside interface of the router must have a private IP; meantime its outside interface has the public IP address X.

2) Subsequent themes of this series
Now, you're ready to move forward. Please note that the implementations will be entirely done in a Linux environment.
- A1: Setting up an all-level ( From the modem to the DMZ via the router) network (virtual and physical)
- A2: Designing and implementing firewalls and Quality-of-Service for your network
- A3: Configuring and securing the servers for your website - Case study: LAMP
(If you are a web developer)

3) Advanced work

A similar sequential approach can be used depending on the service provided by the network you are developing. In our case, it's a hosting network. But you can have a network configuration dedicated to supply cloud services (E.g: IaaS platforms like Amazon EC2), Storage services (E.g: Dropbox), Distributed delivery services (E.g: CDNs)...etc

4) References
1- Designing and Implementing firewalls and QoS with Linux using netfilter, iproute2, NAT and l7-filter, 2006, Lucian Gheorghe
2- http://www.inetdaemon.com/tutorials/basic_concepts/network_models/osi_model/OSI_model_real_world_example.shtml
3- http://www.tcpipguide.com/free/t_HTTPRequestMessageFormat.htm
4- http://www.tcpipguide.com/free/t_HTTPResponseMessageFormat.htm#Figure_318
5- http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29
6- http://www.cyberciti.biz/faq/what-is-a-routing-table/
7- http://en.wikipedia.org/wiki/Private_network
8- http://www.tarunz.org/~vassilii/TAU/protocols/dhcp/ipaddr.htm

Thanks for your time and don't forget to comment and share this article, add me on G+ and subscribe to the blog's RSS feeds

Sunday, November 10, 2013

Case #3: Private trusted identity (RFC 3161)

Requirements: openSSL (v1.0.0+), PHP and Apache installed and running - valid allowance from a TimeStamping Authority (aka TSA), Exec and Curl extensions enabled in your PHP environment.
Now, imagine you have personal files, group of files, secret contracts, explicit pictures, even HTTP headers and many more to which you can give a perfect secrecy or files that you can securely share, all of this with only one keyword: an identity.
For instance, there is an exclusive movie coming out next week and you are the only one within the torrent social network KickAssTorrent who was able to have it. However you want to share it with your peers but at the same time, don't want to be caught by the media authorities. [ Yeah, I know ; as a security guy I'm suppose to give delightful examples, but let's be a bit silly for this week, shall we? ]. So you have to protect yourself by protecting the access to the torrent and your identity (keep it private). This is done by giving access to the torrent to only trusted peers.
Such achievement resides in the innovative cradle that we'll call the "Private Trusted Identity (ID for short)" and is brought to you by Codijuana, trademark of Landry Kouajiep.

(1) How does it works?

While publishing the torrent of the brand-new movie, your identity (username or id in the database ...) is encrypted using the alphapad permutation cipher. Then, the torrent T is saved and a list of trusted users for this specific file T is created. Next, both the ciphered id and hash of T are pushed in an array that is serialized in a string which is also hashed.The resulting hash is used to request a RFC3161 timestamp from a TSA via openssl and curl. The TSA's response is collected and the timestamp is extracted. At the end, we have: the torrent file T, the list of trusted users (i.e people who can access T without validation), the timestamp and the TSA response.

Now if a peer wants to access that torrent (the torrent itself and the identity of its owner), he must provide his identity, the encrypted id of the torrent's owner and the hash of that torrent. Unfortunately for the media authorities or the attackers, the last 2 cited credentials are only accessible to peers that are trusted by the torrent's owner :P .

The theory behind this reasoning is the following: If you can prove you knew a file's hash and the encrypted id of its owner at a certain point in time, that guarantees you knew the file content and its owner.

(2) How is it different from the others?
I guess some of you will call it a "deja vu". Well, let me give 3 reasons that'll wipe that impression from your mind:

Better than the usual digital signature: Apart from verification as is the main purpose of digital signature, this method is an authentication-authorization system. For instance, instead of only signing a Git commit on GitHub, we would also be able to control and monitor the access to that commit using only openssl and curl; no more exhaustful tracking codes; only certificates and the lightweight alphapad permutation cipher. Total empowerment of commits' security.

Deeper than passwords: It not only allows you to protect data layers (files, accounts,..) but can also be used to keep the communication layers safe (http header compression for instance)

More powerful than SSL: It has recently been shown that TLS can be broken in less than a minute using the BREACH attack which abuses the difference of size in the response bodies to infer the original response. The embedded alpha-padding permutation cipher takes advantage of its OTP's characteristics to provide the perfect secrecy and the control on the variation of the response bodies' sizes (inherited from the control on the padding parameter) which seriously gives more control on the variation of the BREACH's chances / probabilities of breaking the cipher.

(3) Enough talking. Can you show us the code and explain it?
Sure. I have to precise that the illustrated code is divided into two main sections respecting the above introduction to the concept. The first section covers "WHAT HAPPENS WHEN THE OWNER SAVES HIS DATA/FILE" while the subsequent takes in charge "WHAT HAPPENS WHEN ANOTHER USER TRIES TO ACCESS THE DATA/FILE". Also the data in the first lines of codes of each section are only examples and can be changed at your own taste. However, for less trouble in your experiments, I recommend you the following:
- For an identity (the variables $my_id in section 3.1 and $provided_id in section 3.2) you should use characters from the alphabet you defined in the alphapadding permutation cipher. More details about this alphabet here.
- For the hash in section 3.2, please make sure it's 40 characters (letters and digits) long since I'm using sha1 here.You can change the hash algorithm too. But make sure to adjust the request to the TSA to match this change.

3.1 WHAT HAPPENS WHEN THE OWNER SAVES HIS DATA/FILE

<?php

include_once "class.alphapad.php";
//- (1) We 1st set user data and the system
$system_pad = 300;
$system_key = 90;
$permutation = new alphaPad($system_key,$system_pad);
$my_id = "john smith"; //- The owner id provided by the system
$my_id = $permutation->_encrypt($my_id); //- set private by encryption

$data = $my_id."_movie_name_".time();
$saved_file = $my_id."/".$data.".torrent"; //- the original document to protect
chmod($saved_file,600); //- to make sure nobody except the owner can access it

$trusted_ids = $my_id."/".$my_id."-list.txt"; //- and create a list of trusted people
$cmd = "grep '".$my_id."' ".$trusted_ids." > result-".$my_id.".out"; exec($cmd);
if (filesize("result-".$my_id.".out") == 0) file_put_contents($trusted_ids, $my_id."\n"); //- starting by the owner himself
chmod($saved_file,600);

// Another way to build a trusted list is by using suPHP to chown files from here to the appropriate user/group and add trusted ids to the owner's group, but the effect will be that the web server no longer can read/edit those files once you change ownership of them.

$my_hash = sha1($data);
$data_string = serialize(array("hash" => $my_hash,"alpha_pad_id" => $my_id)); $fhash = sha1($data_string);

//- (2) Then request a timestamp using OpenSSL and capture the response
// The request is HTTP POST binary data in the header that is passed to the server using curl. You'll need to have a bit of info for this:
// URL - An RFC3161 time-stamping service (DFN-Verein in our case)
// FHASH - The final hash (set up earlier) you want a timestamp for. Must be a full hash.
$cmd0 = '
CONTENT_TYPE="Content-Type: application/timestamp-query"
ACCEPT_TYPE="Accept: application/timestamp-reply"
URL = "http://zeitstempel.dfn.de" ';
exec($cmd0);

$cmd1 = 'openssl ts -query -cert -digest '.escapeshellarg($fhash).' -sha1 \
    | curl -s -S -H "$CONTENT_TYPE" -H "$ACCEPT_TYPE" --data-binary @- "$URL" -o response.tsr ';

//- What $cmd1 does is the following: we create a time stamp request
//- The output is a time stamp request that contains the SHA1 hash value of your data; ready to be sent to the TSA (DFN-Verein in our case - see the bottom of the article for more TSA).
//- Then after the "|" in the command,i.e once the request is ready, the curl program transmits your request to the DFN-Verein servers.
//- If an error occurs then the output file will contain information to help debug (see the parameter -S in the command).
//- Otherwise the output file (.tsr file) is the RFC3161 timestamp response of your file is returned

exec($cmd1." 2>&1", $array, $status);
if ($status !== 0) throw new Exception("OpenSSL does not seem to be installed: ".implode(", ", $array));
if (stripos($array[0], "openssl:Error") !== false) throw new Exception("There was an error with OpenSSL. Is version >= 0.99 installed?: ".implode(", ", $retarray));

//- (3) We now verify the response by extracting the timestamp and if valid, we save that response string/token

$cmd2 = "openssl ts -reply -in response.tsr -text";
$timestamp_response_array = execute_reply($cmd2, $my_id."_".$my_hash);
$matches = array();
$response_time = 0;
foreach ($timestamp_response_array as $retline){
if (preg_match("~^Time\sstamp\:\s(.*)~", $retline, $matches)){
            $response_time = strtotime($matches[1]);
            break;
}
}
//- For a better understanding of this extraction, please take a look at the post following this section

if (!$response_time)throw new Exception("The Timestamp was not found");
echo "File and identity saved, safe and sound! \n
           Here are the credentials for your trusted peers:\n
           hashID:".$my_id."\n
           hashTorrent: ".$my_hash;

function execute_reply($command, $storage_name) {
   $retarray = array();
   exec($command." 2>&1", $retarray, $retcode);
   // 2>&1 appendix redirects the error-stream STDERR to STDOUT
   // all the STDOUTed lines are then sequentially pushed in the returned array $retarray
   // $retcode contains the return status of the executed command

   if ($retcode !== 0) throw new Exception("The reply failed: ".implode(", ", $retarray));
    else {
       //- We gather the response token in a file for future authentications
       $tmpfname = tempnam("/responses", $storage_name); //- tempnam will chmod the file to 600 i.e unalterable except by the owner of the file
       $save_cmd = "echo '".$command."' > ".$tmpfname."";
       unlink("request.tsq"); unlink ("response.tsr");
   }
    return $retarray;
}

During the request for a timestamp via the execute_reply(), the STDOUTed lines are the content chunks of the timestamp response that look like this:

    Using configuration from C:\OpenSSL-Win64\bin\openssl.cfg
    Status info:
    Status: Granted.
    Status description: unspecified
    Failure info: unspecified
    TST info:
    Version: 1
    Policy OID: 1.3.6.1.4.1.4146.2.2
    Hash Algorithm: sha256
    Message data:
        0000 – 58 df 63 8c 5b bf ff ca-ad 13 c9 6e 93 96 cd 25   X.c.[......n...%
        0010 - 66 5e f1 eb ba 8e 7f 74-6d 65 04 3c 5d ea e4 35   f^.....tme.<]..5
    Serial number: 0x2487F5EA8A5A085844ED68A8A7426E07E692E1BD
    Time stamp: Sep 17 05:08:38 2013 GMT
    Accuracy: unspecified
    Ordering: no
    Nonce: unspecified
    TSA: DirName:/C=SG/O=GMO GlobalSign Pte Ltd/CN=GlobalSign TSA for Standard – G1
    Extensions:

In other words, the returned array $timestamp_response_array is simply the above splitted by line. The hash that response was requested for is : 58df638c5bbfffcaad13c96e9396cd 25665ef1ebba8e7f746d65043c5deae435 (see the line starting with: "Message data:..."), the used hash algorithm is sha256 and the TSA is GlobalSign.
Also for simplicity, we assume that the TSA servers that we're using have an availability of 99.9%. For a deeper consideration, you might want to consider the failover scenario and schedule a waiting list of alternate servers.
/*
* Summary:
* INPUT: data T, private identity
* OUTPUT: the timestamp for T, a trusted list and the response token from the TSA
*/

3.2 WHAT HAPPENS WHEN ANOTHER USER TRIES TO ACCESS THE DATA/FILE

Before crawling the validation procedure, let's have a brief chat about the certificate chain of an TS authority (let's called it A).

* An authority A's certificate chain is a file that is used to trace back to an offline cryptographic appliance called the root certificate authority (root CA) where all the private keys are kept in a safe and physically secured facility that meets the security, policy and operational practices that all CAs are mandated to meet. By installing such chain in your browser, your browser can tell if the certificate of A was issued by a root CA or not . The time-stamping service A should be signing your timestamp with a certificate that was issued by a root CA. If not, your timestamp doesn't have much credibility (d). Its signature, at the bottom of its response, (For instance, the line starting by "TSA: DirName:/C=SG/...." in the example bove) indicates that the root CA believes that practices of the authority A below it meets that same high bar.
To better understand the relationship between the root CAs and the intermediate authorities like A, I suggest you to take a look at the trust tree from the ICSI certificate notary.

How does a certificate chain look like?A certificate chain is a usually a certificate of extension .pem or a text file (.txt) and looks like this (courtesy of DFN-Verein): https://pki.pca.dfn.de/global-services-ca/pub/cacert/chain.txt

A certificate chain can be easily created using curl or provided to you by your TSA.
You can also download this bigger certificate chain (with more root CAs): http://curl.haxx.se/ca/cacert.pem

WHY AM I TALKING ABOUT AN AUTHORITY CERTIFICATE CHAIN? Because it's an important piece of the authentication procedure as we will se below.

<?php

include_once "class.alphapad.php";
$system_pad = 300;
$system_key = 90;
$permutation = new alphaPad($system_key,$system_pad);

$user_id = "carla joe"; //- user who is trying to access file T
$user_id = $permutation->_encrypt($user_id);

$provided_id = 12232323434439999; //- should be the same as $my_id for the authentication to work
$provided_hash = "e47b8eef9eb2bedca76dcdd4041d3e1755e2324a"; //- should be the same as $my_hash for the authentication to work

$user_string = serialize(array("hash" => $provided_hash,"alpha_pad_id" => $provided_id)); $user_hash = sha1($user_string);

$response_token = $provided_id."_".$provided_hash;
$file_timestamp = $response_time; //- we suppose it has been saved in the system after the process in the precedent section
$tsa_cert_chain_file = "chain.txt"; //- the certificate chain from the TSA of our system
$trusted_ids = $provided_id."/".$provided_id."-list.txt";

if (file_exists($trusted_ids)){
    $cmd = "grep '".$user_id."' ".$trusted_ids." > result-".$provided_id."-".$user_id.".out";
    exec($cmd);

    if (filesize("result-".$provided_id."-".$user_id.".out") == 0) {
       $validated = validate($user_hash, $response_token, $file_timestamp, $tsa_cert_chain_file, $user_id);
       if ($validated == 1) {
            file_put_contents($trusted_ids, $user_id."\n");
            echo "You're in!";
        }
        else throw new Exception( "Hash mismatch! Please try again");
    }
    else echo "You're in!";
}
else {
    error_log_type(1, $user_id);
     throw new Exception("The file you are trying to access doesn't exist");
}

function validate ($hash, $response_token, $response_time, $tsa_cert_file, $user) {

    if (strlen($hash) !== 40) {
        error_log_type(1, $user, "Provided file's hash");
        throw new Exception("Invalid Hash");
    }
    $response_file = './responses/'.$response_token;
    if (!file_exists($response_file)) {
        error_log_type(1, $user, "Name of the response token");
        throw new Exception("Invalid Hash or owner ID");
    }
    if (!intval($response_time)){
        error_log_type(1, $user, "Provided timestamp");
        throw new Exception("There is no valid response-time given");
    }
    if (!file_exists($tsa_cert_file)) {
        error_log_type(1, $user, "Path to the certiificate chain");
        throw new Exception("The TSA-Certificate could not be found");
    }

$cmd = "openssl ts -verify -digest ".escapeshellarg($hash)." -in ".escapeshellarg($response_file)." -CAfile ".escapeshellarg($tsa_cert_file);

$array = array();
exec($cmd." 2>&1", $array, $status);
    /*
     * Just 2 valid cases:
     * 1) Everything okay -> status 0 + array[0] == "Verification: OK"
     * 2) Hash is wrong -> status 1 + strpos(array[somewhere], "message imprint mismatch") !== false
     * Every other case (Certificate not found / invalid / openssl is not installed / ts command not known) are being handled the same way -> retcode 1 + any retarray NOT containing "message imprint mismatch"
     */

    if ($status === 0 && strtolower(trim($array[0])) == "verification: ok")
       return 1;

    foreach ($array as $rline)
    {
        if (stripos($line, "message imprint mismatch") !== false)
           {error_log_type(2, $user, "Provided file's hash"); return 0;}
    }

    error_log_type(3, $user, implode(", ", $array));
    throw new Exception("System command failed: ".implode(", ", $array));
}

function error_log_type($severity, $user, $parameter){
    date_default_timezone_set('UTC');
    $log_file = "./error.log";
    switch ($severity){
        case(1): $error = "Wrong information provided";break;
        case(2): $error = "Hash's print match failed";break;
        case(3): $error = "Fatal error - System command failed";break;
    }
    $error = "[ ".date('l jS \of F Y h:i:s A')." ] - ".$error." - [ Author: ".$user.", Parameter: ".$parameter."] \n";
    file_put_contents($log_file, $error);
}

/*
* Summary:
* INPUT: user's private identity, file T owner's private identity and the hash of the file T
* OUTPUT: Access to file F containing T or not (reporting the user's trials)
*/

(4) Important notes
[1] Before running the code, you MUST have the allowance from the TSA (DFN in our case) to use their service. More about it here.

Download source

[2] Due to the presence of the subpadding in the alphapadding permutation cipher, the encrypted identity is subjected to identity collision in a case of a multiple-users system. A solution is, of course, to remove the subpadding substraction from the encryption, but this will increase the inference power of an attacker as explained in the "Extended Work" section of the precedent article.

(5) What are the reference hotlines?

References:
(a) https://www.digistamp.com/technical/software-alternatives/using-openssl-to-request-timestamps/?q=1
(b) http://www.d-mueller.de/blog/dealing-with-trusted-timestamps-in-php-rfc-3161/
(c) http://stackoverflow.com/questions/11913228/how-can-i-use-rfc3161-trusted-timestamps-to-prove-the-age-of-commits-in-my-git
(d) http://unmitigatedrisk.com/?p=395 | http://unmitigatedrisk.com/?p=397
(e) http://notary.icsi.berkeley.edu/trust-tree/

(f) http://searchsecurity.techtarget.com/definition/digital-signature

(g) http://keyserver.trustedtimestamping.com:11371/
(h) http://breachattack.com/

More TSAs:
* GlobalSign (https://www.globalsign.com/timestamp-service/)
* DigiStamp (https://www.digistamp.com/)
* DFN-Verein (https://www.dfn.de/en/‎)
* All CA roots: http://notary.icsi.berkeley.edu/trust-tree/

Thanks for your time and don't forget to comment and share this article, add me on G+ and subscribe to the blog's RSS feeds