web development – Grey Panthers Savannah https://grey-panther.net

Walking with objects
https://grey-panther.net/2009/03/walking-with-objects.html
Sun, 15 Mar 2009 19:10:00 +0000

Some time ago I read David Wheeler’s blog post about using the OBJECT tag to embed HTML in your HTML :-). One of the things that piqued my interest was the question: what are the security implications of using this method? Specifically, I wanted to know whether the same cross-domain (same-origin policy) rules apply to the interaction between the parent and child as in the case of IFRAMEs.

The answer? Partially. The specifics are:

  • I tested it with IE8, FF 3 and Opera 9. IE8 doesn’t appear to render the referenced HTML, while the other two browsers do (I’m not sure if this is a security feature or a bug :-)).
  • Neither of the two browsers (FF or Opera) seems to give the parent access to the child, regardless of whether the child is loaded from the same domain or not
  • Both browsers allow the child to access the parent (again, regardless of whether it is included from the same domain or not)

What does this mean for security?

  • You can use this in situations where you can’t inject a script directly, but you can control the URL which is included in the object tag, which is effectively equivalent to an XSS attack. The scenario is less likely than a direct XSS attack, but still feasible
  • You can attack the embedded site by providing HTML elements in the parent which scripts in the child are looking for (for example, if you say “document.body” in the child, it will actually mean the body of the parent). Again, exploiting such a situation requires fairly specific circumstances, but it is not impossible.
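
The experiments described above need nothing more than a parent page that embeds a child page through an OBJECT tag. The following is a minimal sketch of a generator for such a pair of pages, written in Python; the file names (`parent.html`, `child.html`) and the element ids are assumptions made up for this illustration, not the pages used in the original test.

```python
import tempfile
from pathlib import Path

# Hypothetical file names and element ids, chosen only for this illustration.
PARENT_HTML = """<!DOCTYPE html>
<html>
  <body id="parent-body">
    <p id="marker">parent page</p>
    <object id="child" data="{child_url}" type="text/html" width="400" height="200"></object>
  </body>
</html>
"""

CHILD_HTML = """<!DOCTYPE html>
<html>
  <body id="child-body">
    <script>
      // Record which body element the child script actually sees.
      document.title = "body id seen by child: " + document.body.id;
    </script>
  </body>
</html>
"""


def write_test_pages(directory, child_url="child.html"):
    """Write parent.html and child.html into `directory` and return their paths."""
    directory = Path(directory)
    parent = directory / "parent.html"
    child = directory / "child.html"
    parent.write_text(PARENT_HTML.format(child_url=child_url), encoding="utf-8")
    child.write_text(CHILD_HTML, encoding="utf-8")
    return parent, child


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        parent_path, _ = write_test_pages(tmp)
        print(parent_path.read_text(encoding="utf-8"))
```

Passing an absolute URL on another host as `child_url` gives the cross-domain variant of the same test.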

This again shows (as if it needed showing…) how complex web security is and how many features can interact in various ways, making it hard to foresee all the possible combinations and their security implications.

Picture taken from Laurel Fan’s photostream with permission.

Daily funny
https://grey-panther.net/2008/11/daily-funny-3.html
Thu, 13 Nov 2008 06:05:00 +0000

Via Mechanix: the difference between JPEG and PNG illustrated. A similar topic would be: don’t use the same image for the thumbnail and the big image! Just because you said width="320px", the browser still needs to download the whole image!

A childhood memory
https://grey-panther.net/2007/01/a-childhood-memory.html
Mon, 15 Jan 2007 07:16:00 +0000

Something very interesting happened: I remembered that I used to watch (and love) an animated television series called Voltron. But the punchline is how I remembered it: by reading an article on A List Apart about web standards. Incredible, isn’t it?

Implementing Web Services with Open Source Software
https://grey-panther.net/2007/01/implementing-web-services-with-open-source-software.html
Fri, 12 Jan 2007 10:51:00 +0000

Today many services are available (both internal and external to a company) as Web Services, more specifically over SOAP. Companies like Microsoft, IBM or Sun have invested heavily in this field and made many of their products compatible with it (as a client and/or as a server). In this article I will study the different possibilities for implementing a SOAP server with Open Source solutions.

The specific requirements are:

  • It should use the HTTP transport layer (the most commonly used in SOAP)
  • It should either have an embedded HTTP server or be usable with Apache
  • It should be platform independent

But let me step back for a moment and ask: why would you want to go this route? Why not use the products of well-known companies, which offer integration with developer tools and in some cases are available for free? While those products are certainly more mature and easier to use, when going the OSS route you have:

  • more flexibility (because you have the full source code available – and even if you don’t want to actively participate in the development process, it helps a lot for debugging),
  • more deployment options (just think how many webhosts offer Apache / MySQL / PHP / Perl as opposed to IIS, WebSphere or Java)
  • when extending the possible interfacing options of a product written for this platform (adding a SOAP API for a wiki for example) it is easier to use something like this rather than requiring the installation of a whole new framework
  • and finally the issue of cost: while not a big problem because (a) academic institutions already have or can get free licenses for many of the products and (b) the companies themselves distribute their products (or at least some versions) for free, it may still be an argument.

By doing research following these guidelines the following three possibilities emerged:

  • The SOAP::Lite library for Perl

    Advantages:

    • Very easy to use
    • Available across platforms (both from the CPAN and PPM repositories)
    • Has an extensive “cookbook” (set of short HOWTOs): http://cookbook.soaplite.com/
    • Runs in Apache (either as CGI or with mod_perl; in the latter case you may need to replace SOAP::Transport::HTTP::CGI with SOAP::Transport::HTTP::Apache in the examples)
    • Has tracing functionality (to enable it on the client side, include the library like this: use SOAP::Lite +trace => 'all'; and then redirect the stderr output, where the tracing info is dumped, to a file like this: soap-client.pl 2>debug-info.txt)

    Disadvantages:

    • Does not support automatic generation of the WSDL file
    • Sometimes it insists on sending the variables as a certain type (integer) even though I would like to send them as string
  • The SOAP library included with PHP

    Advantages:

    • Usually readily included with PHP
    • Cross platform
    • Under active development

    Disadvantages:

    • Does not support automatic generation of the WSDL file
    • There are few tutorials for it
    • Very basic debugging support. PHP has weak debugging support out of the box, and the fact that the majority of the functions are implemented in a binary library makes things even worse (because you would need a hybrid PHP/binary debugger for proper debugging)
  • The NuSOAP project for PHP

    Advantages:

    • Cross platform (written in PHP)
    • Automatic WSDL generation
    • When accessed with a browser, it presents a friendly HTML interface which lists all the published methods / objects and their parameters
    • Distributed as PHP source files which can be easily installed to most hosts (the user doesn't need to ask the server administrator to load extra binary modules)
    • While PHP has no integrated debugging support, the library itself tries to output debugging information. To activate this mode set the debug variable to 1 (like this: $debug = 1;). The debugging information will be appended to the response XML as a comment.

    Disadvantages:

    • Not very well maintained (probably because of the existence of the "official" PHP SOAP module)
    • Few examples, and many of the examples don't work. For working examples go to the author's website
    • Conflicts with the official PHP SOAP module. If you get an error saying something along the lines of can not redefine class soap_client, you have to unload the PHP SOAP module. Another option would be to go through the source and rename this class.
    • No real debugging support (because PHP doesn't have one).

The final choice was NuSOAP. The deciding factor was the WSDL generation, which is (semi)automatic because you have to give it hints about the parameter types. This is essential if you wish to make your service available to the largest possible audience, especially those using statically typed languages. A perfect example is the .NET / Visual Studio environment, which needs the WSDL file to automatically generate the stub for the web service.

A little side note: if your web service is accessed through SSL / HTTPS and the certificate authority that signed the certificate of the server is not trusted (i.e. it's not a well-known one such as Verisign), you get some warnings while generating the stub in Visual Studio and the final program will halt with an exception saying something like Could not establish a trusted connection over this SSL/TLS connection. The most common cause of this is that the developer uses a self-signed certificate for the server. As far as I know there is no way to stop this from happening from inside the framework. However, because the framework shares its network access architecture with Internet Explorer, you can correct it from there. First you will need the certificate from the server (a .crt file, for example server.crt). Then go to Tools->Internet Options and select the Content tab. Click on Certificates, go to the Trusted Root Certification Authorities tab and select Import. Point it to the server certificate and answer affirmatively to the confirmation dialog. From now on that certificate will also be considered a trusted root certificate: you won't get warnings while browsing sites that use it (and those sites might even have elevated privileges, depending on your Internet Explorer configuration), but most importantly your .NET client side code will work just fine.
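
The underlying idea (trust one specific certificate instead of switching verification off) is not specific to .NET. As an illustration only, here is a minimal Python sketch of the same approach using the standard `ssl` module; it is not how the .NET framework handles it, and the `server.crt` path mentioned in the docstring is a hypothetical file exported from your own test server.

```python
import ssl
import urllib.request


def build_trusting_context(ca_file=None):
    """Return an SSL context that still verifies certificates.

    If `ca_file` is given (for example a hypothetical server.crt exported
    from the test server), it is added to the trusted roots of this context
    only; the system-wide trust store is left untouched.
    """
    context = ssl.create_default_context()
    if ca_file is not None:
        context.load_verify_locations(cafile=ca_file)
    return context


def fetch_wsdl(url, ca_file=None):
    """Download the WSDL document from `url` using the context above."""
    context = build_trusting_context(ca_file)
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()
```

Compared with importing the certificate into the system store, this keeps the extra trust limited to the one client that needs it.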

The test project was to implement a web service which simulated some simple state machine(s). The project was implemented on the following platform:

The test was done on a Windows XP Pro machine using XAMPP to quickly install all the required components. However, there is nothing platform specific in the code or the components, so it should be easy to replicate on a different platform (Linux for example). The PHP code for the server side can be seen in Appendix A and an example state machine definition file can be found in Appendix B. An example client program written in C# can be seen in Appendix C.

The structure of a state machine definition file is as follows:

  • The root node is stateMachine. It has one mandatory parameter: initialState which specifies the initial state it is in
  • In the messages section it defines all possible messages (identified by name) which can be sent to this machine. This enumeration is needed to check the validity of the message names used later on, guarding against mistyping.
  • The list of states, identified by name. The states can be of two types: auto and message. Those of type auto automatically advance from the current state to the next state depending on the contained action elements. Those of type message wait for a message to advance.
  • The action elements contain the following attributes:
    • nextState - mandatory, the name of the next state if this action is chosen
    • waitBefore and waitAfter - optional, the amount of time to pause before and after executing this action, in milliseconds. If omitted, zero is assumed. It must be a non-negative integer.
    • probability - a number greater than 0 but less than or equal to 1.0. Determines the probability of this action being chosen: the closer to 0, the less likely it is, and 1 means always. The sum of probabilities for a group (state element for auto states and message element for message states) must be 1.0. If some probabilities are omitted, the remaining probability is distributed evenly amongst them (so if we have 5 action items and the first has a probability of 0.1, the second one of 0.3 and the rest are omitted, the last three will each have a probability of 0.2)
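
The probability rule can be sketched in a few lines. This is an illustrative Python version rather than the server's PHP code; the function name is made up for the example, `None` stands for an omitted probability attribute, and the 0.001 tolerance mirrors the check used in Appendix A.

```python
def distribute_probabilities(probabilities):
    """Fill in omitted (None) probabilities for one group of actions.

    Explicit values must lie in (0, 1]. The probability left over after
    summing the explicit values is split evenly between the omitted ones,
    and the completed group must sum to 1.0 (within a small tolerance).
    """
    for p in probabilities:
        if p is not None and not (0.0 < p <= 1.0):
            raise ValueError(f"probability out of range: {p}")

    specified = sum(p for p in probabilities if p is not None)
    missing = sum(1 for p in probabilities if p is None)

    if missing:
        share = (1.0 - specified) / missing
        if share <= 0.0:
            raise ValueError("no probability left for the omitted actions")
        filled = [share if p is None else p for p in probabilities]
    else:
        filled = list(probabilities)

    if abs(1.0 - sum(filled)) > 0.001:
        raise ValueError("probabilities of the group do not sum to 1.0")
    return filled


if __name__ == "__main__":
    # The example from the text: the last three actions get about 0.2 each.
    print(distribute_probabilities([0.1, 0.3, None, None, None]))
```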

The functions exported by the server are:

  • postMessage(stateMachineName: string, message: string): string

    Posts a message to the state machine identified by stateMachineName. The definition for this state machine must be stored on the server in the file <stateMachineName>.xml. It is assumed that there is only one "running" instance of each state machine. This is guaranteed by the fact that their state is stored in a database table protected by write locks during the transitions. The method is synchronous and returns the name of the state that results from processing the message and any automatic steps (states of type auto) which follow. One thing to keep in mind is that if you specify a state of type auto as the initial state, it will also be evaluated when the first message is posted. On error it returns an empty string and you can use the getErrorMessage function to get the error message.

  • getMachineState(stateMachineName: string): string

    Gets the current state of the given state machine. It's asynchronous (with respect to the state machine, not the caller). If an error has occurred in the state machine it returns the empty string. You can use the getErrorMessage function to get the error message.

  • getErrorMessage(stateMachineName: string): string

    Returns the error message for the given state machine or empty if no error exists

  • resetMachineState(stateMachineName: string): void

    Resets the machine to its initial state (as specified by the initialState attribute of the corresponding definition file)

  • resetAllMachines: void

    Resets all the state machines to their initial state
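
To give an idea of what travels over the wire when one of these functions is called, here is a simplified Python sketch that builds a SOAP 1.1 request envelope for postMessage. The `urn:statemachine` namespace and the `urn:statemachine#postMessage` action come from the server code in Appendix A; the helper names are made up for this illustration, and the rpc/encoded details (encodingStyle attribute, xsi:type annotations) that a stub generated from the WSDL would add are left out.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "urn:statemachine"  # namespace registered by the server


def soap_action(method):
    """Value for the SOAPAction HTTP header, as registered in Appendix A."""
    return f"{SERVICE_NS}#{method}"


def build_request(method, params):
    """Build a request envelope for `method`.

    `params` is an ordered list of (name, value) string pairs.
    """
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params:
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(soap_action("postMessage"))
    print(build_request("postMessage",
                        [("stateMachineName", "testAutomata"),
                         ("message", "Flip")]))
```

In practice a client would rarely assemble the envelope by hand; the point of the WSDL is that tools can generate this plumbing automatically.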

Appendix A - PHP Server side code

To install it you would need the following items:

  • The NuSOAP library in the lib subdirectory (or anywhere else, just be sure to adjust the include directive accordingly)
  • The PearDB package for database access
  • Adjust the $data_directory variable so that it points to the directory where the XML files describing the state machines are located. Important: include the trailing slash or backslash, depending on the platform
  • Create two database tables to store the current state of the automata (the second table is for locking purposes only, because MySQL doesn't support writing while in a read lock). The SQL statements to create these tables are (you might need to tweak them a little if you are using something other than MySQL or a different version of it):

    CREATE TABLE `state_machines`.`state_machines` (
    `machine_name` VARCHAR(255) NOT NULL DEFAULT '',
    `machine_state` VARCHAR(255) NOT NULL DEFAULT '',
    `error_message` VARCHAR(255) NOT NULL DEFAULT '',
    PRIMARY KEY(`machine_name`)
    )
    ENGINE = MEMORY;
    
    CREATE TABLE `state_machines`.`lock_table` (
     `dummy_column` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
     PRIMARY KEY(`dummy_column`)
    )
    ENGINE = MEMORY;
    

  • Adjust the connection string accordingly in the DB::connect call
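
The role of the state table is easy to try out in isolation. The sketch below uses Python with an in-memory SQLite database as a stand-in for MySQL (an assumption made purely so that the example is self-contained); the column names follow the CREATE TABLE statement above, the function names are made up, and the table locking is not reproduced.

```python
import sqlite3


def open_store():
    """Create an in-memory database with the state table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE state_machines ("
        " machine_name TEXT PRIMARY KEY,"
        " machine_state TEXT NOT NULL DEFAULT '',"
        " error_message TEXT NOT NULL DEFAULT '')"
    )
    return conn


def save_state(conn, name, state, error=""):
    """Insert or overwrite the row of one machine (keyed by its name)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "REPLACE INTO state_machines"
            " (machine_name, machine_state, error_message) VALUES (?, ?, ?)",
            (name, state, error),
        )


def load_state(conn, name, initial_state):
    """Return the stored state, or the initial state if no row exists."""
    row = conn.execute(
        "SELECT machine_state FROM state_machines WHERE machine_name = ?",
        (name,),
    ).fetchone()
    return initial_state if row is None else row[0]


def reset_state(conn, name):
    """Deleting the row makes the machine fall back to its initial state."""
    with conn:
        conn.execute(
            "DELETE FROM state_machines WHERE machine_name = ?", (name,)
        )
```

Note that the machine name has to be part of the replaced row: REPLACE identifies the row to overwrite through the primary key, not through a WHERE clause.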

<?php
  require_once 'lib/nusoap.php';
  require_once 'DB.php';
  
  $server = new soap_server();
  $server->configureWSDL('statemachine', 'urn:statemachine');
  
  //single quotes keep the backslashes literal; only the trailing one needs escaping
  $data_directory = 'C:\xampp\htdocs\webservice\data\\';
  $db_connection = DB::connect("mysql://state_machine:password@localhost/state_machines");
  if (DB::isError($db_connection)) 
    stopWithErrorMessage("failed to connect to the database - " . $db_connection->getMessage());
  
  // Register the methods to expose
  $server->register('postMessage',    
      array('stateMachineName' => 'xsd:string', 'message' => 'xsd:string'),
      array('return' => 'xsd:string'),
      'urn:statemachine',             
      'urn:statemachine#postMessage', 
      'rpc',                          
      'encoded',                      
      'send a message to a given state machine. returns the new state'      
  );
  $server->register('getMachineState',    
      array('stateMachineName' => 'xsd:string'),
      array('return' => 'xsd:string'),
      'urn:statemachine',             
      'urn:statemachine#getMachineState', 
      'rpc',                          
      'encoded',                      
      'returns the current state of the automaton'      
  );
  $server->register('getErrorMessage',    
      array('stateMachineName' => 'xsd:string'),
      array('return' => 'xsd:string'),
      'urn:statemachine',             
      'urn:statemachine#getErrorMessage', 
      'rpc',                          
      'encoded',                      
      'returns the error message for a given state machine ("" if no error exists)'      
  );  
  $server->register('resetMachineState',    
      array('stateMachineName' => 'xsd:string'),
      array(),
      'urn:statemachine',             
      'urn:statemachine#resetMachineState', 
      'rpc',                          
      'encoded',                      
      'resets the given state machine'      
  );  
  $server->register('resetAllMachines',    
      array(),
      array(),
      'urn:statemachine',             
      'urn:statemachine#resetAllMachines', 
      'rpc',                          
      'encoded',                      
      'resets all the state machines'      
  );    

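  //read the raw request body ($HTTP_RAW_POST_DATA is no longer available in current PHP versions)
  $server->service(file_get_contents('php://input'));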

  //send a message to a given state machine. returns the new state
  function postMessage($stateMachineName, $message) {
    //only word characters, hyphens and whitespace are allowed in the name
    if (!preg_match('/^[\w\-\s]+$/', $stateMachineName)) 
      stopWithErrorMessage('state machine name contains illegal characters');
    global $data_directory;
    if (!is_file($data_directory . $stateMachineName . ".xml")) 
      stopWithErrorMessage('specified state machine does not exists');
      
    //load up the state machine (a string result is an error message from the parser)
    $stateMachine = loadStateMachineFromFile($data_directory . $stateMachineName . ".xml");
    global $db_connection;
    //from now on we need to be synchronized with other threads - lock the database tables
    //(while holding locks, MySQL only lets the session access the tables it has locked)
    $db_connection->query('LOCK TABLES lock_table WRITE, state_machines WRITE');
    if (is_string($stateMachine)) {
      //an error has occurred! store the error message and return the empty string
      $new_state = '';
      $error_message = $stateMachine;
    } else {
      //synchronize it with the database
      $stateMachine = synchronizeStateMachine($stateMachine, $stateMachineName);
      //now post the message to the state machine
      $stateMachine = postMessageWhilePossible($stateMachine, $message);
      //everything went ok, store the new state and return it
      $new_state = $stateMachine['currentState'];
      $error_message = '';
    }
    //REPLACE has no WHERE clause: the row is identified by the primary key (machine_name)
    $db_connection->query('REPLACE INTO state_machines (machine_name, machine_state, error_message) VALUES ("' .
      addslashes($stateMachineName) . '", "' . addslashes($new_state) . '", "' . addslashes($error_message) . '")');
    $db_connection->query('UNLOCK TABLES');
    return $new_state;
  }
  
  //returns the current state of the automaton
  function getMachineState($stateMachineName) {
    if (!preg_match('/^[\w\-\s]+$/', $stateMachineName)) 
      stopWithErrorMessage('state machine name contains illegal characters');
    global $data_directory;
    if (!is_file($data_directory . $stateMachineName . ".xml")) 
      stopWithErrorMessage('specified state machine does not exists');
      
    //load up the state machine
    $stateMachine = loadStateMachineFromFile($data_directory . $stateMachineName . ".xml");
    //synchronize it with the database
    $stateMachine = synchronizeStateMachine($stateMachine, $stateMachineName);
    //return the current state
    return $stateMachine['currentState'];
  }
  
  //returns the error message for a given state machine ('' if no error exists)
  function getErrorMessage($stateMachineName) {
    if (!preg_match('/^[\w\-\s]+$/', $stateMachineName)) 
      stopWithErrorMessage('state machine name contains illegal characters');
    global $db_connection;
    return $db_connection->getOne('SELECT error_message FROM state_machines WHERE machine_name="' . addslashes($stateMachineName) . '"');
  }
  
  //resets the given state machine
  function resetMachineState($stateMachineName) {
    if (!preg_match('/^[\w\-\s]+$/', $stateMachineName)) 
      stopWithErrorMessage('state machine name contains illegal characters');
    global $db_connection;
    $db_connection->query('DELETE FROM state_machines WHERE machine_name="' . addslashes($stateMachineName) . '"');
  }
  
  //resets all the state machines
  function resetAllMachines() {
    global $db_connection;
    $db_connection->query('DELETE FROM state_machines');
  }
  
  //internal helper function which outputs the error message to the header and then exits
  function stopWithErrorMessage($error_message) {
    header("HTTP/1.1 500 Internal Server Error: " . $error_message, true, 500);
    exit;     
  }
  
  //internal helper function - synchronizes the state of an automaton with the one stored in the database
  //(if it's stored there)
  function synchronizeStateMachine($state_machine, $stateMachineName) {
    global $db_connection;
    //fetch the row as an associative array so that the columns can be accessed by name
    $machine_state = $db_connection->getRow('SELECT * FROM state_machines WHERE machine_name="' . addslashes($stateMachineName) . '"',
      array(), DB_FETCHMODE_ASSOC);
    if (is_array($machine_state)) {
      //it is present in the database, try to synchronize with it
      if ( ('' == $machine_state['machine_state']) || array_key_exists($machine_state['machine_state'], $state_machine['states'])) {
        $state_machine['currentState'] = $machine_state['machine_state'];
        return $state_machine;
      } else {
        stopWithErrorMessage("Erroneous state in the database: " . $machine_state['machine_state']);        
        exit;
      }
    }
    //it's not present in the database, leave it as it is
    return $state_machine;
  }
  
  //internal helper function. Checks if the given node has the specified attribute
  //if not, returns null, if it does, it returns it
  function getAttributeOrNull($node, $attr_name) {
    if ($node->hasAttribute($attr_name))
      return $node->getAttribute($attr_name);
    return null;      
  }
  
  //internal helper function. Extract and validate the action elements from
  //a node (message or state). Return an array structure on success,
  //an error message (string) on failure
  function extractActions($parent_node, $states_list, $state_name, $message_name) {
    //process the probable actions - make sure that the sum of the probabilities is 1.0
    //when no probability is specified, the remaining probability is distributed between them
    $error_message_suffix = ('' == $message_name) ? '' : " at message '$message_name'";
    $actions = $parent_node->getElementsByTagName('action');
    $result = array();
    $probability_sum = 0.0; $actions_with_no_probability = 0;
    foreach ($actions as $action) {
      $new_action = array();
      if (null !== ($action_probability = getAttributeOrNull($action, 'probability'))) {
        if ($action_probability <= 0)
          return "Negative probability of action in state '$state_name'$error_message_suffix";
        $probability_sum += 0.0 + $action_probability;
        $new_action['probability'] = 0.0 + $action_probability;
      } else {
        ++$actions_with_no_probability;
      }
      if (null === ($action_wait_before = getAttributeOrNull($action, 'waitBefore'))) {
        $new_action['waitBefore'] = 0;
      } else {
        if ($action_wait_before < 0)
          return "Negative 'waitBefore' of action in state '$state_name'$error_message_suffix";
        $new_action['waitBefore'] = intval($action_wait_before);
      }
      if (null === ($action_wait_after = getAttributeOrNull($action, 'waitAfter'))) {
        $new_action['waitAfter'] = 0;
      } else {
        if ($action_wait_after < 0)
          return "Negative 'waitAfter' of action in state '$state_name'$error_message_suffix";
        $new_action['waitAfter'] = intval($action_wait_after);
      }     
      if (null === ($action_next_state = getAttributeOrNull($action, 'nextState')))
        return "Unspecified nextState in state '$state_name'$error_message_suffix";
      if (!array_key_exists($action_next_state, $states_list))
        return "Invalid nextState specified in action at state '$state_name'$error_message_suffix: '$action_next_state'";
      $new_action['nextState'] = $action_next_state;
                  
      $result[] = $new_action;
    }
    //now redistribute the remaining probability :)
    if ($actions_with_no_probability > 0) {
      foreach (array_keys($result) as $action_key) {
        if (!array_key_exists('probability', $result[$action_key])) {
          $result[$action_key]['probability'] = (1.0 - $probability_sum) / $actions_with_no_probability;
        }
      }
    }
    //finally sum up the probability and check it (must be 1.0)
    $probability_sum = 0.0;
    foreach ($result as $action)
      $probability_sum += $action['probability'];
    if (abs(1.0 - $probability_sum) > 0.001)
      return "The sum of probabilities for state '$state_name'$error_message_suffix is way off from 1.0"; 
    
    return $result;
  }

  //returns a structure which completly describes the state machine
  //returns a string with an error message if the XML failed to follow
  //the rules
  function loadStateMachine($state_machine_xml) {
    $doc = new DOMDocument();
    $doc->loadXML($state_machine_xml);    
    
    //this will be the result if all goes well
    $result = array();
    
    //find all the valid message
    $xpath = new DOMXPath($doc);
    $valid_messages = array();
    foreach ($xpath->query('//stateMachine/messages/message') as $valid_message) {
      if (null === getAttributeOrNull($valid_message, 'name'))
        return "Found message element which doesn't have the 'name' attribute!";
      $valid_messages[getAttributeOrNull($valid_message, 'name')] = 1;
    } 
    if (0 >= count($valid_messages))
      return 'No valid message names found!';   
          
    //now parse states
    $result['states'] = array();
    $states = $doc->getElementsByTagName('state');
    if (0 >= $states->length)
      return 'No state elements found!';
    
    //first store the state names so that we can validate them later on
    foreach ($states as $state) {
      if (null === ($state_name = getAttributeOrNull($state, 'name')))
        return 'Found state with no name!';
      if (array_key_exists($state_name, $result['states']))
        return "Found state with duplicate name: '$state_name'";
      $result['states'][$state_name] = array();
    }
    
    foreach ($states as $state) {
      //validate the basic parameters for the state
      $state_name = getAttributeOrNull($state, 'name');
      if (null === ($state_type = getAttributeOrNull($state, 'type')))
        return 'Found state with no type!';
      if ( ('message' != $state_type) && ('auto' != $state_type) )
        return "Found state with invalid type: '$state_type'";
        
      //save the validated stuff
      $result['states'][$state_name]['type'] = $state_type;
      
      //process the available state transitions
      if ('message' == $state_type) {
        $messages = $state->getElementsByTagName('message');
        $result['states'][$state_name]['messages'] = array();
        foreach ($messages as $message) {
          //message name: - it should exists, - it should be valid and - it shouldn't be used before (in this state)
          if (null === ($message_name = getAttributeOrNull($message, 'name')))
            return "Found message with no name in state '$state_name'";
          if (!array_key_exists($message_name, $valid_messages))
            return "Found invalid message name '$message_name' in state '$state_name'";
          if (array_key_exists($message_name, $result['states'][$state_name]['messages']))          
            return "Found duplicate message name '$message_name' in state '$state_name'";
          $result['states'][$state_name]['messages'][$message_name] = array();
          
          $result['states'][$state_name]['messages'][$message_name]['actions'] = 
            extractActions($message, $result['states'], $state_name, $message_name);
          if (is_string($result['states'][$state_name]['messages'][$message_name]['actions']))
            //an error has occured
            return $result['states'][$state_name]['messages'][$message_name]['actions'];
        }       
      } else {
        //load the actions we can chose from
        $result['states'][$state_name]['actions'] = 
          extractActions($state, $result['states'], $state_name, '');
        if (is_string($result['states'][$state_name]['actions']))
          //an error has occurred
          return $result['states'][$state_name]['actions'];
      }
    }
      
    //get the starting state and make sure that it's a valid state
    if (null === ($initial_state = getAttributeOrNull($doc->documentElement, 'initialState')))
      return 'Document has no initial state!';      
    if (!array_key_exists($initial_state, $result['states']))
      return 'Initial state is invalid!';   
    $result['currentState'] = $initial_state;   
    
    return $result;
  }
  
  function loadStateMachineFromFile($file_machine) {
    $xml_file_contents = file_get_contents($file_machine);
    //no unescaping is needed here: magic_quotes_gpc only affected request data, not file contents
    return loadStateMachine($xml_file_contents);
  }
  
  //applies a given message to a given state machine. it executes the specified waits
  //returns the modified state machine. If an error occured, the currentState will be set to ''
  function internalPostMessage($state_machine, $message = '') {
    //print "Applying message: '$message'\n";
    //print "Current state: $state_machine[currentState]\n";
    
    //we are in an invalid state - we can't do anything
    if (!array_key_exists($state_machine['currentState'], $state_machine['states']))
      return $state_machine;
        
    if ('message' == $state_machine['states'][$state_machine['currentState']]['type']) {
      if (!array_key_exists($message, $state_machine['states'][$state_machine['currentState']]['messages'])) {
        //this message can not be applied now
        $state_machine['currentState'] = '';
        return $state_machine;
      }
      $actions = $state_machine['states'][$state_machine['currentState']]['messages'][$message]['actions'];
    } else {
      $actions = $state_machine['states'][$state_machine['currentState']]['actions'];
    }
        
    //now chose an action, by "throwing a dice"
    $rand_value = rand(0, 32768) / 32768;
    $action_to_execute = null;
    foreach ($actions as $action) {
      $action_to_execute = $action;
      if ($rand_value <= $action['probability'])        
        break;      
      $rand_value -= $action['probability'];
    }   
    
    //print "Going to state $action_to_execute[nextState]\n";
    //now execute the chosen action (the waits are given in milliseconds, usleep expects microseconds)
    usleep(intval($action_to_execute['waitBefore']) * 1000);
    $state_machine['currentState'] = $action_to_execute['nextState'];    
    usleep(intval($action_to_execute['waitAfter']) * 1000);
    //print "Gone to state $action_to_execute[nextState]\n";   
    
    return $state_machine;
  }
  
  //the same as above, however it continues while possible after the first move
  //(while the current state is an automatic one)
  function postMessageWhilePossible($state_machine, $message) {
    //process any automatic states BEFORE
    while ( ('' != $state_machine['currentState']) &&
      ('auto' == $state_machine['states'][$state_machine['currentState']]['type']) ) {
      $state_machine = internalPostMessage($state_machine);
    }
    if ('' != $state_machine['currentState'])
      $state_machine = internalPostMessage($state_machine, $message);
    //process any automatic states AFTER
    while ( ('' != $state_machine['currentState']) &&
      ('auto' == $state_machine['states'][$state_machine['currentState']]['type']) ) {
      $state_machine = internalPostMessage($state_machine);
    }
    return $state_machine;
  } 
?>
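
The "dice throwing" step above can also be looked at in isolation. The following is a minimal sketch in Python (not part of the original PHP service), assuming each action is a dictionary with a `probability` key, as in the PHP arrays. The function name `choose_action` and the optional `rand_value` parameter are made up for illustration; the parameter only exists so that the choice can be reproduced.

```python
import random


def choose_action(actions, rand_value=None):
    """Pick one action from a list of dicts which have a 'probability' key.

    Mirrors the PHP loop: walk the list and subtract each probability
    from the random value until the value falls inside an action's share.
    """
    if not actions:
        return None
    if rand_value is None:
        rand_value = random.random()
    chosen = None
    for action in actions:
        chosen = action
        if rand_value <= action["probability"]:
            break
        rand_value -= action["probability"]
    return chosen


# The two actions of the "Transient" state from the example state machine
actions = [
    {"probability": 0.9, "nextState": "On"},
    {"probability": 0.1, "nextState": "Off"},
]
print(choose_action(actions, 0.5)["nextState"])   # On
print(choose_action(actions, 0.95)["nextState"])  # Off
```

Note that if the probabilities add up to less than 1, the last action in the list absorbs the remainder, because the loop always remembers the last action it looked at.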

Appendix B - Example state machine file

<stateMachine initialState="Off">
  <messages>
    <message name="Flip" />
  </messages>
  <state name="Off" type="message">
    <message name="Flip">
      <action probability="0.5" nextState="Transient" />
      <action probability="0.5" nextState="Off" />
    </message>
  </state>
  <state name="Transient" type="auto">
    <action probability="0.9" nextState="On" waitBefore="1000" />
    <action probability="0.1" nextState="Off" waitBefore="1000" />  
  </state>
  <state name="On" type="message">
    <message name="Flip">
      <action nextState="Off" />  
    </message>
  </state>
</stateMachine>
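
To make the connection between this file and the arrays used in Appendix A more visible, here is a rough sketch in Python which loads such a file into the same nested layout (`states`, `type`, `messages`, `actions`). It is not the loader used by the service. The function names and the default values (probability 1, waits of 0 when the attribute is missing) are my assumptions.

```python
import xml.etree.ElementTree as ET

EXAMPLE_XML = """
<stateMachine initialState="Off">
  <state name="Off" type="message">
    <message name="Flip">
      <action probability="0.5" nextState="Transient" />
      <action probability="0.5" nextState="Off" />
    </message>
  </state>
  <state name="Transient" type="auto">
    <action probability="0.9" nextState="On" waitBefore="1000" />
    <action probability="0.1" nextState="Off" waitBefore="1000" />
  </state>
  <state name="On" type="message">
    <message name="Flip">
      <action nextState="Off" />
    </message>
  </state>
</stateMachine>
"""


def parse_actions(parent):
    """Collect the <action> children of a <state> or <message> element."""
    return [
        {
            # Assumed defaults: a lone action is certain, missing waits are zero
            "probability": float(a.get("probability", "1")),
            "nextState": a.get("nextState"),
            "waitBefore": int(a.get("waitBefore", "0")),
            "waitAfter": int(a.get("waitAfter", "0")),
        }
        for a in parent.findall("action")
    ]


def load_state_machine(xml_text):
    """Build the nested dictionary layout which the PHP code indexes into."""
    root = ET.fromstring(xml_text.strip())
    machine = {"currentState": root.get("initialState"), "states": {}}
    for state in root.findall("state"):
        entry = {"type": state.get("type")}
        if entry["type"] == "message":
            entry["messages"] = {
                m.get("name"): {"actions": parse_actions(m)}
                for m in state.findall("message")
            }
        else:
            entry["actions"] = parse_actions(state)
        machine["states"][state.get("name")] = entry
    return machine


machine = load_state_machine(EXAMPLE_XML)
print(machine["currentState"])
print(machine["states"]["Transient"]["actions"])
```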

Appendix C - Example client program in .NET (C#, VB .NET, Delphi and Managed C++)

Before you can use these examples, you have to add a Web Reference to your project. You can do this by right-clicking the References folder of your project in Visual Studio and selecting Add Web Reference. Enter the link of the service with ?wsdl appended (to get the WSDL file). For example, if you are hosting the service locally, you would enter http://localhost/webservice/index.php?wsdl

C#
private static void Main(string[] args)
{
      Console.WriteLine("Starting up...");
      statemachine statemachine1 = new statemachine();
      Console.WriteLine("Startup done...");
      for (int num1 = 0; num1 < 10; num1++)
      {
            Console.WriteLine("The current state is: " + statemachine1.getMachineState("testAutomata"));
            Console.WriteLine("Passing message: Flip");
            Console.WriteLine("Test automata returned: " + statemachine1.postMessage("testAutomata", "Flip"));
            Console.WriteLine("---");
      }
      Console.WriteLine("Press any key to exit...");
      Console.ReadKey();
}
VB .NET
Private Shared Sub Main(ByVal args As String())
      Console.WriteLine("Starting up...")
      Dim statemachine1 As New statemachine
      Console.WriteLine("Startup done...")
      Dim num1 As Integer = 0
      Do While (num1 < 10)
            Console.WriteLine(("The current state is: " & statemachine1.getMachineState("testAutomata")))
            Console.WriteLine("Passing message: Flip")
            Console.WriteLine(("Test automata returned: " & statemachine1.postMessage("testAutomata", "Flip")))
            Console.WriteLine("---")
            num1 += 1
      Loop
      Console.WriteLine("Press any key to exit...")
      Console.ReadKey
End Sub
Delphi
procedure Program.Main(args: string[]);
begin
      Console.WriteLine('Starting up...');
      statemachine1 := statemachine.Create;
      Console.WriteLine('Startup done...');
      num1 := 0;
      while ((num1 < 10)) do
      begin
            Console.WriteLine(string.Concat('The current state is: ', statemachine1.getMachineState('testAutomata')));
            Console.WriteLine('Passing message: Flip');
            Console.WriteLine(string.Concat('Test automata returned: ', statemachine1.postMessage('testAutomata', 'Flip')));
            Console.WriteLine('---');
            inc(num1)
      end;
      Console.WriteLine('Press any key to exit...');
      Console.ReadKey
end;
Managed C++
private: static void __gc* Main(System::String __gc* args __gc [])
{
      System::Console::WriteLine(S"Starting up...");
      ContactWebservice::stateMachine::statemachine __gc* statemachine1 = __gc new ContactWebservice::stateMachine::statemachine();
      System::Console::WriteLine(S"Startup done...");
      for (System::Int32 __gc* num1 = 0; (num1 < 10); num1++)
      {
            System::Console::WriteLine(System::String::Concat(S"The current state is: ", statemachine1->getMachineState(S"testAutomata")));
            System::Console::WriteLine(S"Passing message: Flip");
            System::Console::WriteLine(System::String::Concat(S"Test automata returned: ", statemachine1->postMessage(S"testAutomata", S"Flip")));
            System::Console::WriteLine(S"---");
      }
      System::Console::WriteLine(S"Press any key to exit...");
      System::Console::ReadKey();
}

Appendix D - Example client written in Perl

use warnings;
use diagnostics;
use SOAP::Lite +trace => 'all';

print ">>" . SOAP::Lite
  -> uri('http://www.soaplite.com/Demo')
  -> proxy('http://localhost/webservice/index.php')
  -> getMachineState("testAutomata")
  -> result;
]]>
Tracking web users https://grey-panther.net/2006/11/tracking-web-users.html https://grey-panther.net/2006/11/tracking-web-users.html#comments Sat, 04 Nov 2006 15:26:00 +0000 https://grey-panther.net/?p=1025 Again, this will be something new here (at least for me): I’ll publish a pre-rant for Security Now! Steve Gibson expressed interest in the subject of cookies, so I’ll tackle that in this post and also the more general question of user-tracking. I discuss different ways it can be accomplished, ways you could protect yourself and the question: should you?

In a way the World Wide Web is a marketing company’s wet dream: just imagine tracking the moves of users and building a profile which lists their potential interests (as inferred from the list of visited sites and the frequency of the visits). Using this profile, they can show ads which they consider relevant to us. Of course they don’t do this out of the goodness of their heart. They do it because you have a higher probability of reacting to an advertisement if it’s relevant to you.

Here are the means I know of which can be used to accomplish this:

  • Tracking cookies or third party cookies – this is IMHO a bad name (from a technical point of view), and I’ll explain why in a minute. But first let’s answer the question: what are cookies? Cookies (or HTTP State Management Mechanism, as the official RFC refers to them) are tokens, opaque from the point of view of the client, which contain information that helps the server side application recognize that different HTTP requests are part of the same session. This is necessary since the HTTP protocol does not define any method for creating, tracking and destroying sessions. That is, whenever you request an object from the web server, it will treat it as a separate request, having no idea what you requested earlier. The cookie is used as a token in the following way: the server says to the client: take this piece of information and return it to me on subsequent requests. This way it can determine if a request is part of the same session (because it can hand out a different value to each client, and when the client returns the value, it can identify the session the request is part of). Before you ask: you can’t use IP addresses as a reliable unique identifier because of proxies and NATs. You can observe two things here: this behavior is entirely voluntary on the client’s part (it may choose not to return the token), and it applies to every HTTP transaction, not just HTML documents (including images, flash animations, java applets, etc). Of course the standard defines a policy which specifies the requests in which the cookie should be returned. 
The elevator speech version of this is: a cookie will only be sent back with requests targeted at the server it was originally sent from, and only for elements whose path is prefixed by the path contained in the cookie (for example, if the cookie was set by the object located at http://example.com/set/a/cookie with the path /set/a, it will be sent with all requests which are targeted at the example.com server and whose path begins with /set/a). Now how is this used to track you from site to site if the cookie is only returned to the server it was originally sent from? Enter the advertisement companies: they serve up ads from the same server to many webpages. This means that those webpages contain links to elements (usually images, flash animations or javascript) which reside on the server of the advertiser. So if you view a page which contains an advert from a given company, it can set a cookie, which will later be sent back to it when you view another page (possibly from another server) which contains an advert from the same company (because in both cases the object – image, flash, whatever – came from the same source that set the cookie: the server of the advertiser). This is called a third party cookie because it is set by a different entity than the server you see in your address bar. However I think that this is a bad name, since it implies that some kind of spoofing is going on, like a server setting a cookie for another server – which, by the way, is explicitly prohibited by the standard and won’t work in any modern browser. To sum up:
    • Applicability: (almost) every browser supports it. The standard itself is relatively old (almost 10 years)
    • Customizability: Current browsers offer ways to set a policy on what cookies should / should not be accepted both in a whitelist and blacklist format. Usually they do not include the option to view the cookies stored on the machine, but there are many free third party tools / extensions which enable you to do this.
    • Risk of disabling it: if cookies are disabled altogether, many sites which have a member-only area will break and the user will be unable to log in. Disabling third party cookies breaks pages which host elements fetched from a third party server (which represents a small but growing percentage of the web in the age of mashups)
  • Flash Local Shared Objects (AKA flash cookies) – As of version six (also called Flash MX), the Flash Player can store information which has to be preserved across different page loads locally on the user’s computer. Before that, sites used a combination of javascript, cookies and actionscript to obtain the same effect. Flash Local Shared Objects have the same restrictions as cookies for forwarding (i.e. they’re only available to flash movies which originate from the same server). Because this feature was little known outside of the Flash developer community, because its interface was hidden, and because scaremongering led many users to remove or disable cookies, advertisers started to use it instead of cookies.
    • Applicability: on any platform which has at least version 6 of the Flash Player installed.
    • Customizability: you can go to the site of Adobe to completely disable or to manage the shared objects which are on your computer. There is also a Firefox extension, however it seems dated and not maintained any more, so probably the safest bet is to go with the official links provided above.
    • Risk of disabling it: sites which rely on it may break, however I haven’t found any sites so far which rely on it for purposes other than tracking, so currently it may be disabled without any problems. This may change in the future however.
  • Referrer URLs – The referrer URL is a piece of information sent by your browser when requesting an object from a web server. For example if you click a link at http://foo.com/link.htm which takes you to http://bar.com/target.htm, the bar.com webserver will receive as part of the request (if you didn’t disable it in your browser) the string http://foo.com/link.htm as the referrer. This can be (and is) used by sites for statistical purposes (to see who links to them) and for security (however this is a pretty weak form of security, since it relies on the client playing it straight and thus it can be spoofed). One thing which makes privacy advocates suggest turning this feature off is the fact that if you go to a page from a search engine (that is, you searched for bar.com on google and then clicked on one of the results), the target server can know the words you searched for (since they will be embedded in the referrer url). However, this information isn’t forwarded to the advertisers unless they use third party javascript to get it (which I’ll talk about later on). That is, if you go: Google -> Google search results -> foo.com -> (automatically, because it is embedded in the page at foo.com) advertiser, the referrer transmitted at the last step (that is, from foo.com to the advertiser) is foo.com, meaning that the only information the advertiser gets is the fact that the ad was loaded from foo.com, not the way by which the user arrived at foo.com. I want to stress this because Steve Gibson got this wrong on episode 64 of the Security Now podcast. (I want to stress again that advertisers can get the referrer of the page which includes the advertisement by using third party javascript, which I’ll talk about shortly.)
    • Applicability: on almost every browser
    • Customizability: you can see a tutorial about enabling it here which should point you in the right direction.
    • Risk of disabling it: you shouldn’t encounter any problems, because few sites use it for purposes other than statistics. But if you don’t mind, give them this piece of information: it can be used to create better content for you!
  • Third party javascript – usually when a site collaborates with a given advertiser, its owner is asked to put a piece of HTML in every page where s/he wants the ads to be displayed. This code is usually an IFRAME tag or a SCRIPT tag. In the latter case we talk about third party scripts – javascript code which is provided by a third party and runs in the context of the current page. This code can do almost everything, including the following things: access the referrer of the current page (so even if it isn’t directly relayed to the advertisement server, the script can forward it), get information about the browser capabilities (screen resolution, etc) and perform history digging (see the next point).
    • Applicability: on every browser which understands javascript.
    • Customizability: in Firefox you can use the NoScript extension. In Internet Explorer you can add the sites you want to block scripts from to the Restricted Sites Zone. Another solution would be to disable javascript entirely, but this will reduce the usability of many sites.
    • Risk of disabling it: mashups make heavy use of third party javascript (to embed Google Maps for example). Also some big sites host their script files on different servers than the content (to be able to optimize the servers for the specific types of files), so you can’t say generally that everything third party is bad.
  • History digging – This is a really cool technique, reported first as far as I can tell by Jeremiah Grossman and was later tweaked to work with IE. It is based on the fact that visited links have different styles than non-visited links (this is usually observed as different colors). If you put a bunch of links on a page and then use javascript to inspect the styles applied to them by the browser, you can tell if the given sites are in the history of the browser.
    • Applicability: there is proof of concept code for Firefox and IE. It should work in any browser which has a standard conformant implementation of javascript and DOM.
    • Customizability: you can’t programmatically disable just this feature. Your options are: (a) disabling javascript (b) cleaning your history before you visit sites you suspect are doing this. One important fact: if an advertiser embeds javascript on the site the ad is displayed on, it can use this technique to find out if you visited a given site. Fortunately there is a mitigating factor: in order for somebody to find out if you visited a given page, s/he has to know the exact url of the page (that is, this method can not be used to enumerate the entries of your history)
  • Sign-in information – a fact people often overlook is that the big three identity providers (Google, Yahoo and MSN) also provide advertising. Because of this they can correlate tracking information obtained by any of the methods listed above with the personal information you provided at signup. Now I’m not saying that they do this, I’m just saying that they have the technical means to do it.
    • Applicability: if you are a user of any of these sites and browse sites – while you are logged on – which display advertisement from them, you are affected.
    • Customizability: log off before browsing to other sites and clear all the cookies from them. Before logging back in also clear the cookies from them placed there by the ads.
    • Risk of disabling it: the inconvenience of constantly having to clear cookies.
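
To make the cookie mechanics above more concrete, here is a small sketch in Python. It is a simplified model, not a full implementation of the cookie standard: it only checks the host and the path (ignoring the Domain attribute, expiry and so on), and the `AdServer` class, its method names and the dictionary layout are made up for the illustration. It shows why the "third party" can follow you: the browser returns the same cookie to the advertiser’s server no matter which page embedded the ad.

```python
from urllib.parse import urlsplit


def path_matches(request_path, cookie_path):
    """True if the request path equals the cookie path or lies below it."""
    if request_path == cookie_path:
        return True
    prefix = cookie_path if cookie_path.endswith("/") else cookie_path + "/"
    return request_path.startswith(prefix)


def cookie_applies(cookie, url):
    """Simplified rule: same host, and the cookie path prefixes the URL path."""
    parts = urlsplit(url)
    return parts.hostname == cookie["host"] and path_matches(parts.path, cookie["path"])


class AdServer:
    """Hypothetical advertiser which hands out one ID per browser."""

    def __init__(self):
        self.next_id = 1
        self.profiles = {}  # visitor ID -> pages which embedded the ad

    def serve_ad(self, cookie_value, embedding_page):
        # No cookie came with the request: first visit, hand out a new ID
        if cookie_value is None:
            cookie_value = str(self.next_id)
            self.next_id += 1
        # The embedding page is known from the referrer of the ad request
        self.profiles.setdefault(cookie_value, []).append(embedding_page)
        return cookie_value  # goes back to the browser as the cookie


cookie = {"host": "example.com", "path": "/set/a"}
print(cookie_applies(cookie, "http://example.com/set/a/other"))  # True
print(cookie_applies(cookie, "http://example.com/elsewhere"))    # False

# One browser views two unrelated sites which both embed the same advertiser
ads = AdServer()
browser_cookie = None
for page in ["http://foo.com/link.htm", "http://bar.com/target.htm"]:
    browser_cookie = ads.serve_ad(browser_cookie, page)
print(ads.profiles)
# {'1': ['http://foo.com/link.htm', 'http://bar.com/target.htm']}
```

Nothing in the sketch involves one server setting a cookie for another: the advertiser only ever sees its own cookie, it just sees it from many pages.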

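The point about referrers and search terms can be illustrated in a similar way. The sketch below is in Python and assumes that the search engine puts the query into a `q` parameter of the results URL; the function name and the example URLs are made up. It shows what the site you land on can read from the referrer, and that a referrer pointing to an ordinary page carries no search terms.

```python
from urllib.parse import parse_qs, urlsplit


def search_terms_from_referrer(referrer):
    """Return the search terms embedded in a referrer URL, or None."""
    query = parse_qs(urlsplit(referrer).query)
    # Assumption: the search engine uses a "q" parameter for the query
    return query.get("q", [None])[0]


# What bar.com receives when you arrive from a search results page
print(search_terms_from_referrer("http://www.google.com/search?q=bar.com+review"))
# bar.com review

# What the advertiser receives when its ad is loaded from a page at foo.com
print(search_terms_from_referrer("http://foo.com/link.htm"))
# None
```
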
Now for the philosophical question: should you be worried? Should you go to great lengths to avoid this tracking, even at the cost of breaking useful features on the site? You should consider the following ideas (they are not absolute truths, but arguments which are used in this debate):

  • Nothing is free, and advertisement is an (arguably) quick and (mostly) painless way of paying for the content / service. So disabling advertisement can be thought of as a way of cheating to get what you desire without payment.
  • Contextual ads can be useful. For example if I would like to buy a laptop and I see an ad for a laptop, I will most probably click it. This is useful for both parties: for me, because I possibly learn about an offer I didn’t know about, and for the company who put out the ad, because I might buy something from them.
  • Some people say: but this is not right! The user should be in control! If you want to buy laptops, search for them yourself! Of course no rational person (no offense to anybody) would buy something of significant value based on one ad (because usually it’s only showing one detail of the product – probably not mentioning the not-so-bright sides) but it may add value to your research. So, while you shouldn’t buy based on what they say on the teleshopping channel – err I mean ad 🙂 – it may add value to your research while you are considering your options.
  • The tinfoil hat people may say: I don’t want the government / Amazon / Google / whatever to track my every movement! I have a right to privacy! – and they are right, they do have a right to privacy, however they must be willing to give up certain benefits or to take some additional steps. And before you object saying: why do I have to make extra efforts to get the same service everybody receives while keeping my information as private as possible? – just consider how things work in the real world: if you want to drive a car, you must get a license. It is your right to drive a car (if you are of legal age), however you still have to get a license. Because every analogy breaks down, let’s consider the technical point of view: every technology can be used for good and bad (this is even more so if there is no clear distinction between good and bad). The only way of preventing 100% of the bad usages of a technology is to ban it altogether. You may choose this, but be aware that you are not getting the benefits either. Now some of the technologies (like session cookies) can be emulated by other technologies (like appending the SID – the session identifier – to every request as a GET parameter), however the given technology was introduced to make it easier to accomplish certain tasks without the complication and hassle the old method needed. Guess what a rational website owner / creator would do: use the more complex, less reliable and more expensive technology for a very small percentage of its visitors, or go with the easier and more powerful technology?
]]>
Web Developer Stereotypes https://grey-panther.net/2006/10/web-developer-stereotypes.html https://grey-panther.net/2006/10/web-developer-stereotypes.html#respond Wed, 18 Oct 2006 07:02:00 +0000 https://grey-panther.net/?p=1035 Sitepoint did a survey amongst web developers and found that people who use PHP are very likely to try Ruby on Rails. While I haven’t completed the survey myself, I find that I’m in this exact same position: I’ve been developing in PHP for several years now and plan to check out Ruby, however I’m reluctant to do anything big in it until I understand the full extent of the magically generated code and the security implications it has.

Credits: I came across this link on the Tucows blog.

]]>