Financial data aggregators today rely on an old model for aggregation: screen scraping combined with storing user credentials. There are better models for cross-application integration over the internet, from both a data quality and a security perspective, and OAuth feels like a better fit for this type of scenario.

Using the stored-credentials model, aggregators ask for your credentials to each online banking website. Here is an example of Mint.com asking for a username and password to get data about an American Express card from Bank of America:

You’ll need to repeat the above process for each Financial Institution you want to aggregate. This means the aggregator will have access to your banking credentials at every site where you do banking. Of course aggregators use military grade encryption and security procedures (we hope), but there is still a small chance of your credentials being compromised; there is always some risk with any system.

There are several problems with this model, most significant are:

  1. Credentials are stored in the aggregator’s database.
  2. Data extraction is fragile because it relies on screen scraping, which breaks easily when the HTML or page layout changes at the source site.
  3. The level of permission is not controllable. It’s all or nothing: you can’t grant permission to just read accounts. Once you hand over the credentials, the scripts can do whatever they want.

A better model might move the authorization control to the source system. Let the source system control what an external system (aggregator) can access. Let the user of the source system control the level of permissions allowed (account balance, transaction history, etc.). Now the sequence looks like this:

The user has authorized the aggregator to access their account data from Online Banking Site A, but disabled access from Online Banking Site B. The aggregator would try to synchronize account data from all OAuth providers and would succeed with Bank A, but not with Bank B, because the user disabled access.

This model moves authorization control to the source system. It also removes the need for the aggregator to store credentials at all. Aggregators get the additional benefit of higher quality data, since they are now consuming a real API instead of screen scraping.

I could imagine a set of permissions that aggregators (or other banking applications) would request when a user authorizes an application. These could include:

  1. Read Only Account List
  2. Read Only Transaction History
  3. Read Only Upcoming Bills
  4. etc…

I can’t imagine most Financial Institutions would want to expose write permissions, e.g. money movement operations, but that’s a theoretical option.
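
To make that concrete, here is a rough sketch of what an aggregator’s authorization request might look like if a bank exposed an OAuth endpoint with read-only scopes. Everything here is hypothetical - the URLs, client id, and scope names are made up for illustration:

var scopes = "accounts.read transactions.read bills.read";   // hypothetical scope names

var authorizeUrl = "https://onlinebanking.example.com/oauth/authorize"   // hypothetical bank endpoint
    + "?response_type=code"
    + "&client_id=aggregator-app"
    + "&redirect_uri=" + Uri.EscapeDataString("https://aggregator.example.com/callback")
    + "&scope=" + Uri.EscapeDataString(scopes);

// The user signs in at the bank's site (never at the aggregator), approves or declines
// the requested scopes, and the aggregator ends up with a token limited to read access.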

A quick Google search revealed someone else thinking about the same thing; they actually asked some of their Financial Institutions, but didn’t get much of a response. My guess is the banking industry is in a “let’s wait and see what happens” mode with this sort of thing, but I hope to soon see a “let’s actually do something about this” mode.


Last week I attended SQL Saturday in Portland, where I watched a very inspiring presentation on multi-threaded TSQL programming by John Huang. Below is a high level overview of the presentation, covering the parts I thought were most interesting.

SQL Server has a feature called Service Broker which lets you send work into a queue. It is a durable queue, meaning it is backed by the database, so your work won’t get lost if a server goes offline (as it could with a volatile in-memory queue).

SQL Server also supports a number of lock types, and understanding how these locks work is the key to multi-threading with TSQL. The solution John outlined is built around queues and locks that control parallel workers pulling from the queue.

A simple queue model

Consider a table of work (SQL to be executed) that contains a task Id and a status. A continuous loop runs over this table, grabbing rows that are ‘Not Started’ and executing the SQL.
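
A minimal sketch of what such a table might look like (the column names here are my own, not John’s):

CREATE TABLE dbo.WorkQueue
(
    TaskId  INT IDENTITY(1,1) PRIMARY KEY,
    Command NVARCHAR(MAX) NOT NULL,               -- the SQL to be executed
    Status  VARCHAR(20)   NOT NULL DEFAULT 'Not Started'
);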

Now consider the following TSQL:
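
(John’s original sample isn’t reproduced here; this is a sketch of the general shape of the statement, using the table above.)

BEGIN TRANSACTION;

DECLARE @TaskId INT, @Command NVARCHAR(MAX);

-- Claim the next available task; READPAST lets other workers skip the locked row
UPDATE TOP (1) dbo.WorkQueue WITH (READPAST)
SET    Status   = 'In Progress',
       @TaskId  = TaskId,
       @Command = Command
WHERE  Status = 'Not Started';

IF @Command IS NOT NULL
BEGIN
    EXEC sp_executesql @Command;
    UPDATE dbo.WorkQueue SET Status = 'Completed' WHERE TaskId = @TaskId;
END

COMMIT TRANSACTION;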

Notice that the entire statement is wrapped in a transaction. The first operation is an update on the first row whose status is ‘Not Started’, which locks that row with an update lock. Also notice the WITH (READPAST) hint - this is what keeps other parallel workers from being blocked by the update: instead of waiting, the other workers read past the locked row to the next available one and start working on it.

A queue model with task dependency support

Now consider a scenario where you want to control not just the order in which tasks are executed, but also ensure dependent tasks are executed in the correct order. What you need is a model that expresses which tasks depend on other tasks. With a slight adjustment to the above table, we get something like this:
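
One possible shape for the adjusted table (again, the column names are my own):

CREATE TABLE dbo.WorkQueue
(
    TaskId          INT IDENTITY(1,1) PRIMARY KEY,
    DependsOnTaskId INT NULL REFERENCES dbo.WorkQueue (TaskId),   -- the task that must finish first
    Command         NVARCHAR(MAX) NOT NULL,
    Status          VARCHAR(20)   NOT NULL DEFAULT 'Not Started'
);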

Now the locking gets a bit more complicated, because you have to ensure a task’s dependencies have completed before executing it. The solution uses two specific types of locks in SQL Server:

  1. Shared Lock - Used for read operations that do not change or update data, such as a SELECT statement.
  2. Intent Exclusive Lock - Used to establish a lock hierarchy. The types of intent locks are: intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).

The flow for a single thread pulling work out of the queue and executing it:

What’s happening here is:

  1. Start a transaction
  2. Get the message out of the Queue
  3. Get a shared lock on the current unit of work
  4. Commit the transaction
  5. Get an intent exclusive lock on the dependent unit of work
  6. Execute the unit of work
  7. Release shared lock
  8. Release the intent exclusive lock

That’s basically the model for controlling work in a parallel fashion using TSQL. Refer to John’s Multi-Threading TSQL presentation for more details. This includes a slide deck and code samples.


Quick answer: no.

Bitcoin is a peer-to-peer digital currency. There is a fixed number of coins that can ever be created. Coins are created by mining (running hash functions over and over), and only a limited number of new coins can be produced per hour across the whole network. Because so many people are mining, you no longer have a realistic chance of mining a coin on your own, so you have to join a “mining pool”. In a mining pool, you split profits with the others in the pool. Here is an example of a site doing this:

http://www.bitcoinplus.com/generate

Depending on your computer hardware you will mine at a different rate. I tested this on a computer with 4 cores.

Given the above statistics around the estimated time per payout, I plugged the numbers into a spreadsheet.

This means you would need to leave your 4-core PC running all day, every day, for a year, and you would end up making less than a dollar and fifty cents for the year. You would obviously pay far more than that in electricity to run your PC that long. So basically you can’t make money unless you can lower the hours-per-payout variable substantially.
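
The original spreadsheet isn’t shown here, but the arithmetic behind it is simple. A sketch with placeholder inputs (these are not the measured figures from my test):

double hoursPerPayout = 150;     // placeholder: hours of mining per pool payout on this PC
double usdPerPayout   = 0.025;   // placeholder: USD value of a single payout
double hoursPerYear   = 24 * 365;

double usdPerYear = (hoursPerYear / hoursPerPayout) * usdPerPayout;
Console.WriteLine(usdPerYear);   // with these placeholder inputs, roughly $1.46 per year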

The hours-per-payout variable will only continue to increase, because it becomes more and more difficult to mine bitcoin over time: there are more miners, and the rate at which new bitcoin can be created diminishes over time as per the bitcoin algorithm.

The people who created and first started mining bitcoins were able to generate tons of them; then, as more people started mining, the amount per miner dropped. This is basically a ponzi scheme, where the people who get in first make money off the people who get in later. If one of the first miners was able to generate a million bitcoins, they could sell them for dollars today and make $4.7 million, assuming there was liquidity (which there probably is not).

The bottom line is that you can’t make money mining bitcoin now, but the people who did it at the start have the potential to make tons of money. Great for the inventors of the system, but not much value (if any) for new miners.


Active Web Solutions has built an alerting platform that interfaces with a global satellite network for emergency notifications. The system uses Windows Azure to process events through a number of queues/worker roles, etc… This video gives a high level overview of their architecture. Kudos to them for being open kimono about the whole thing :-)


I just released a Visual Studio project template for building WCF service contracts. With a simple contract definition (an XML file) you can generate an entire service contract. The project uses XSLT to transform your service definition into C# source code.

To get started download the Service Contract project template here.

Open the file named ContractDefinition.xml. Fill out the definition of your service. In this example we’ll create a banking service with three operations: GetAccounts, GetTransactionHistory, and Transfer.

Now build your project and check out the generated C# code. The file ContractDefinition.cs is nested under the xml file.

At this point you have just generated the following things:

  1. Service interface
  2. Service messages (request/response)
  3. Service faults
  4. Service client
  5. Unit tests
  6. Base service implementation

Here’s a class view of what got generated:

You’ll notice all the generated classes are partial. The intention is that you write the other half of each partial class - for things like request and response messages - adding data members and so on.
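
For example, you might extend one of the generated request messages like this (the class and property names here are assumptions, not the template’s actual output):

using System;
using System.Runtime.Serialization;

public partial class GetTransactionHistoryRequest
{
    [DataMember]
    public string AccountId { get; set; }

    [DataMember]
    public DateTime StartDate { get; set; }
}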

If you need to add another operation simply update the xml file and rebuild.

Enjoy.


A typical scenario when integrating systems is “check something on system A, then update system B”.

When you have complete control over both systems you can do this in very efficient ways, like table-level joins in a database. However, if you don’t have complete control over both systems, you’ll be limited to the APIs of both systems. In that case you may find you need to call an operation on system A over and over, once for each piece of context from system B.

Consider the following scenario:

  • You want to monitor content on the web about a set of keywords.
  • You want to index documents found for these keywords.
  • You have a web search API, but it only allows for real time searches.
  • You need a way to continuously execute real time searches, while updating a local document database with new results.

Now also consider the following constraints:

  • The search API you have only allows for 10,000 searches per day.
  • You want to distribute your searches throughout 24 hours, such that you don’t cross the daily search threshold.

To do this you’ll need a way to control the rate at which you send search requests to the search service. One solution is to apply a throttled processing queue. In this solution you run two separate threads in parallel:

  1. One thread continuously generates search requests and sends them to a processing queue, where they wait for a processing thread to pick them up.
  2. Another thread continuously monitors the processing queue for new search requests, executing them as they arrive, at a controlled rate (see the sketch after this list).
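
Here is a minimal sketch of that pattern (this is not the throttled processing API itself, just the idea), using a BlockingCollection as the processing queue and a fixed delay to stay under 10,000 searches per day:

using System;
using System.Collections.Concurrent;
using System.Threading;

class ThrottledSearchDemo
{
    static readonly BlockingCollection<string> Queue = new BlockingCollection<string>();

    static void Main()
    {
        // Thread 1: continuously generates search requests and puts them on the queue.
        new Thread(() =>
        {
            foreach (var keyword in new[] { "keyword1", "keyword2", "keyword3" })   // placeholder keywords
                Queue.Add(keyword);
            Queue.CompleteAdding();
        }).Start();

        // Thread 2: drains the queue at a controlled rate.
        // 10,000 searches per day works out to roughly one search every 8.6 seconds.
        var delay = TimeSpan.FromSeconds(86400.0 / 10000);
        new Thread(() =>
        {
            foreach (var request in Queue.GetConsumingEnumerable())
            {
                Console.WriteLine("Searching: " + request);   // placeholder for the real search call
                Thread.Sleep(delay);
            }
        }).Start();
    }
}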

It turns out this is a very typical pattern for a lot of different scenarios. Some other examples include:

  • A banking application that continuously looks for upcoming bills in a bill payment service and alerts the user when a bill is due.
  • A stock prediction application that continuously pulls down market prices for securities of interest.
  • A publishing system that continuously looks for new items to publish.

All of these scenarios can be abstracted into a common set of patterns. This set of patterns is what I have programmed into an open source .NET API called “throttled processing”. What you get with the API is a base class you can use to run the two threads mentioned above (the request generator and the request processor). There is also an extension to the base class that lets you provide an operation on a service to be used in the processing thread. This means all you really need to implement is the logic for the request-generating thread.

There is also a class that lets you start multiple “processors” at the same time. This means you can essentially write a few processors (behavior that ties together queries and/or state changes across multiple systems) and run them all at once. There is an example application included with the source code; have a look at that to see how it works.

The key value provided by this approach is the ability to throttle the rate at which you send requests to a given service. This is extremely important, because without it your process could easily be interpreted as a denial of service attack on 3rd party systems. You also often run into processing limitations of 3rd party services: perhaps they can’t handle 3 requests at once, so you need to throttle down to 2 requests at a time. This control is what you get with the throttled processing API. Lastly, multi-threaded, rate-controlled processing queues are complicated to program - you don’t want to do this yourself if someone else has already spent the time finding the right abstractions. Hopefully this API will save you that time.

Source code and sample application can be found here:

http://throttledprocessing.codeplex.com/


A quick presentation about the WebSocket protocol coming with HTML5. It covers the handshake, the JavaScript clients, a .NET implementation of a WebSocket server, and some real-life examples.

 


websequencediagrams.com has developed a unique service for generating sequence diagrams from an intuitive text language. The basic service is free to use; images are watermarked with the company logo at the bottom. For about $100 you can get your own image server with no watermarks and support for advanced diagram features. I’m using the private image server for this post.

The tool comes with a web-based interface where you can enter text and generate diagrams in real time. There is also a JavaScript API that lets you embed the text in HTML and have the image generated dynamically when the page loads.

Generating Sequence Diagrams

The following text:

User->TopicService: Search(topic, query)
TopicService->Repository: Content Cached?
Repository-->TopicService: Yes (Cache Results) / No
opt Content Not Cached
    TopicService->ContentProvider: Search(query)
    ContentProvider-->TopicService: Search Results
    TopicService->Repository: Cache Results (topic, query)
end
TopicService->User: Search Results

results in the following image:

What I really like about this approach over something like Visio is that you have source text that can be tracked via source control. It’s just like any other type of source code. This makes for easy refactoring if operation or service names change.

The use of Loop and Parallel

The following diagram shows the use of a loop and the parallel operation (parallel is only available in the paid version).

loop Loop Through Keywords
Processor->IndexingService: keyword
parallel {
    IndexingService->Google: query
    IndexingService->Flickr: 
    IndexingService->YouTube: 
}
parallel {
    Google-->IndexingService: results
    Flickr-->IndexingService:
    YouTube-->IndexingService:
}
IndexingService->IndexingService: Index Results
end

This results in the following diagram:

There are a number of output formats including the following:

  • Plain UML
  • Rose
  • QSD
  • Napkin (what’s shown in this post)
  • VS2010
  • MScGen
  • OmegApple
  • Blue Modern
  • Green Earth
  • Round Green

Here is the above diagram in Round Green:

Code Generation Potential

If you can model the relationships between the operations in your services, you could potentially auto-generate the sequence text. Taking the concept further, you could generate both the sequence text and the HTML that wraps it - completely automated sequence documentation.

To do this you need a model that expresses services, the operations they expose, and the mappings between those operations. You also need a way to traverse the graph of mappings between operations. That is the hard part, but once it’s done it can provide a lot of value, because you can generate sequence text automatically.
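
A rough sketch of the idea - a tiny model of operation-to-operation calls, traversed to emit websequencediagrams text (the model and names here are purely illustrative):

using System.Collections.Generic;
using System.Text;

public class Call
{
    public string From;
    public string To;
    public string Operation;
    public List<Call> DownstreamCalls = new List<Call>();
}

public static class SequenceTextGenerator
{
    // Walks the graph of calls, emitting one line of sequence text per call
    // and a dashed return line once the downstream calls are done.
    public static string Generate(Call call)
    {
        var output = new StringBuilder();
        Emit(call, output);
        return output.ToString();
    }

    private static void Emit(Call call, StringBuilder output)
    {
        output.AppendLine(call.From + "->" + call.To + ": " + call.Operation);
        foreach (var downstream in call.DownstreamCalls)
            Emit(downstream, output);
        output.AppendLine(call.To + "-->" + call.From + ": results");
    }
}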

Tool Assessment

Overall this is a great product. The folks behind the service have put a lot of thought into making sequence diagrams extremely easy and functional. I have already started using the JavaScript API for my own documentation needs, writing out sequence text in HTML documents. I’d recommend this tool to any software architect / technical analyst / system integrator type of person.


Here is a simple demonstration of constructor injection using an IOC container. The demo includes both a Unity and an Autofac implementation.

Constructor injection means the dependencies (parameters on a constructor) are resolved by looking inside an IOC container (a graph of dependencies). This lets us write classes that simply express the things they depend on via their constructors. To increase loose coupling, interface types are commonly used as the parameters.

Consider the following context: a banking service that uses some other services, specifically an account service and a user service. The account and user services use common services for things like logging, caching, auditing, etc. The interfaces for these services might look something like this:
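
(The demo source has the real definitions; the signatures below are a guess at their shape.)

public interface ILoggingService { void Log(string message); }
public interface ICacheService   { object Get(string key); void Put(string key, object value); }
public interface IAuditService   { void Audit(string operation); }

public interface IAccountService { decimal GetBalance(string accountId); }
public interface IUserService    { string GetUserName(string userId); }
public interface IBankingService { decimal GetAccountBalance(string userId, string accountId); }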

Now we provide an implementation for each of these services. On the constructor we add interface parameters for the other services the class depends on. Consider the following mocked account service, for example:
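
(A sketch, not the exact demo code; the method that uses these dependencies is shown in the next snippet.)

public class AccountService : IAccountService
{
    private readonly IAuditService _auditService;
    private readonly ICacheService _cacheService;
    private readonly ILoggingService _loggingService;

    // The constructor declares everything this class depends on.
    public AccountService(IAuditService auditService,
                          ICacheService cacheService,
                          ILoggingService loggingService)
    {
        _auditService = auditService;
        _cacheService = cacheService;
        _loggingService = loggingService;
    }
}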

We are saying this service requires an IAuditService, an ICacheService, and an ILoggingService. We simply store the instances provided to us in private members of the class, and we can then use them throughout the implementation, like so:
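
(Again a sketch; this method would sit inside the AccountService class above.)

public decimal GetBalance(string accountId)
{
    _loggingService.Log("Getting balance for account " + accountId);
    _auditService.Audit("GetBalance:" + accountId);

    var cached = _cacheService.Get(accountId);
    if (cached != null)
        return (decimal)cached;

    var balance = 0m;   // a mocked value; a real implementation would look this up
    _cacheService.Put(accountId, balance);
    return balance;
}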

We can follow the pattern above for all our services, adding interface parameters to the constructor and setting local member fields. When we are ready to run the program, we first need to build up the dependency graph (the IOC container). We have defined an interface for resolving instances out of a container; it looks like this:
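
(Presumably something very close to this.)

public interface IIOCContainer
{
    T Resolve<T>();
}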

As you can see, it only does one thing: resolve types. So it’s up to the class that implements this interface to build up the container in its constructor. Here is a Unity-based sample:
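
(A sketch of the Unity version; the registrations assume the concrete service types from the demo, e.g. LoggingService, AccountService, and so on.)

using Microsoft.Practices.Unity;

public class UnityIOCContainer : IIOCContainer
{
    private readonly IUnityContainer _container;

    public UnityIOCContainer()
    {
        // Build up the dependency graph by mapping interfaces to concrete types.
        _container = new UnityContainer();
        _container.RegisterType<ILoggingService, LoggingService>();
        _container.RegisterType<ICacheService, CacheService>();
        _container.RegisterType<IAuditService, AuditService>();
        _container.RegisterType<IAccountService, AccountService>();
        _container.RegisterType<IUserService, UserService>();
        _container.RegisterType<IBankingService, BankingService>();
    }

    public T Resolve<T>()
    {
        return _container.Resolve<T>();
    }
}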

And here is an Autofac-based sample:
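
(A corresponding sketch for Autofac, with the same assumed concrete types.)

using Autofac;

public class AutofacIOCContainer : IIOCContainer
{
    private readonly IContainer _container;

    public AutofacIOCContainer()
    {
        // Same dependency graph, expressed with Autofac's registration syntax.
        var builder = new ContainerBuilder();
        builder.RegisterType<LoggingService>().As<ILoggingService>();
        builder.RegisterType<CacheService>().As<ICacheService>();
        builder.RegisterType<AuditService>().As<IAuditService>();
        builder.RegisterType<AccountService>().As<IAccountService>();
        builder.RegisterType<UserService>().As<IUserService>();
        builder.RegisterType<BankingService>().As<IBankingService>();
        _container = builder.Build();
    }

    public T Resolve<T>()
    {
        return _container.Resolve<T>();
    }
}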

The above graph of dependencies can be visualized like so:

Now we can run the following program to see both Unity and Autofac based approaches to dependency injection via our IIOCContainer interface:
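
(The program boils down to something like this sketch - the real thing is in the source download.)

class Program
{
    static void Main()
    {
        RunTest(new UnityIOCContainer());
        RunTest(new AutofacIOCContainer());
    }

    static void RunTest(IIOCContainer container)
    {
        // The container supplies every constructor dependency in the graph.
        var bankingService = container.Resolve<IBankingService>();
        bankingService.GetAccountBalance("user1", "account1");   // placeholder arguments
    }
}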

Running the above program shows that both IOC containers do the same thing: provide dependencies to constructors:

As you can see, exactly the same thing happened in each test, even though completely different containers were used. Hopefully this demo gives you a good starting point for your loosely coupled designs.

The source code for this demo can be found here:

http://iocdemo.codeplex.com/


I’ve been thinking a lot about alerts lately - I put together a very simple deck to show how an alert can be sent based on an event.