Enhancing Talk .NET

Talk .NET presents a straightforward way to reinvent the popular instant-messaging application in .NET code. However, as it currently stands, it's best suited for small groups of users and heavily reliant on a central coordination server. In fact, in many respects it's hard to call this a true peer-to-peer application at all.

Fortunately, Talk .NET is just a foundation that you can build on. This section considers possible enhancements, stumbling blocks, and a minor redesign that allows true peer-to-peer communication.

Cleaning Up After Clients

Currently, the system assumes that all clients will log out politely when they've finished using the system. Due to network problems, program error, or some other uncontrollable factor, this may not be the case. Remember, one of the defining characteristics of any peer-to-peer system is that it must take into account the varying, fragile connectivity of users on the Internet. For this reason, Talk .NET needs to adopt a more defensive approach.

Currently, the SendMessage() method raises an unhandled exception if it can't contact the specified user. This exception propagates back to the user-interface code, where it's handled and results in an error message for the user. The problem with this approach is that the unreachable user remains in the server's collection and continues to "appear" online. If another user attempts to send a message to this user, valuable server time will be wasted attempting to contact the offline user before the same exception is raised. This problem will persist until the missing user logs back in to the system.

To account for this problem, users should be removed from the collection if they cannot be contacted. Here's the important portion of the SendMessage() code, revised accordingly:

If Not Recipient Is Nothing Then

    Dim callback As New ReceiveMessageCallback( _
      AddressOf Recipient.ReceiveMessage)

    Try
        callback.BeginInvoke(message, senderAlias, Nothing, Nothing)
    Catch Err As Exception
        ' Client could not be contacted.
        Trace.Write("Message delivery failed")
        ActiveUsers.Remove(recipientAlias)
    End Try

End If
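One detail is worth verifying in your own build. When a delegate is started with BeginInvoke(), an exception raised by the target method is held until EndInvoke() is called, so a Try block around BeginInvoke() alone may not see a failed delivery. The following is a minimal sketch of one way to observe the failure, assuming the .NET Framework (delegate BeginInvoke() isn't supported on .NET Core and later). The UnreachableClient() and DeliveryCompleted() methods and the module itself are hypothetical stand-ins, not part of Talk .NET.

```vbnet
Imports System
Imports System.Collections
Imports System.Threading

Module DeliveryFailureSketch

    Delegate Sub ReceiveMessageCallback(ByVal message As String, _
      ByVal senderAlias As String)

    ' Stand-in for the server's user collection.
    Public ActiveUsers As Hashtable = Hashtable.Synchronized(New Hashtable())
    Private Done As New ManualResetEvent(False)

    ' Stand-in for a proxy call to a client that has gone away.
    Private Sub UnreachableClient(ByVal message As String, _
      ByVal senderAlias As String)
        Throw New ApplicationException("Client could not be contacted.")
    End Sub

    ' Runs on a thread-pool thread when the asynchronous call finishes.
    Private Sub DeliveryCompleted(ByVal ar As IAsyncResult)
        Dim state As Object() = CType(ar.AsyncState, Object())
        Dim callback As ReceiveMessageCallback = _
          CType(state(0), ReceiveMessageCallback)
        Dim recipientAlias As String = CStr(state(1))

        Try
            ' EndInvoke() rethrows any exception raised by the call.
            callback.EndInvoke(ar)
        Catch Err As Exception
            ActiveUsers.Remove(recipientAlias)
        Finally
            Done.Set()
        End Try
    End Sub

    Sub Main()
        ActiveUsers("Lucy") = "placeholder"

        Dim callback As New ReceiveMessageCallback(AddressOf UnreachableClient)
        callback.BeginInvoke("Hello", "Matt", _
          New AsyncCallback(AddressOf DeliveryCompleted), _
          New Object() {callback, "Lucy"})

        ' Wait for the completion callback before checking the collection.
        Done.WaitOne()
        Console.WriteLine("Lucy still listed: " & _
          ActiveUsers.ContainsKey("Lucy").ToString())
    End Sub

End Module
```

Because the completion callback runs on a different thread from the caller, the sketch wraps the collection with Hashtable.Synchronized(). Any code that notifies the sender from this callback would need the same care.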

You may also want to send a message explaining the problem to the user. However, you also need to protect yourself in case the user who sent the message can't be contacted or found. To prevent the code from becoming too fragmented, you can rewrite it using recursion, as shown here:

Public Sub SendMessage(ByVal senderAlias As String, _
  ByVal recipientAlias As String, ByVal message As String) _
  Implements TalkComponent.ITalkServer.SendMessage

    Dim Recipient As ITalkClient
    If ActiveUsers.ContainsKey(recipientAlias) Then
        Trace.Write("Recipient '" & recipientAlias & "' found")
        Recipient = CType(ActiveUsers(recipientAlias), ITalkClient)

        If Not Recipient Is Nothing Then

            Trace.Write("Delivering message to '" & recipientAlias & _
              "' from '" & senderAlias & "'")
            Dim callback As New ReceiveMessageCallback( _
              AddressOf Recipient.ReceiveMessage)

            ' Deliver the message.
            Try
                callback.BeginInvoke(message, senderAlias, Nothing, Nothing)

            Catch Err As Exception
                ' Client could not be contacted.
                ActiveUsers.Remove(recipientAlias)

                If senderAlias <> "Talk .NET" Then
                    ' Try to send a warning message.
                    message = "'" & message & "' could not be delivered."
                    SendMessage("Talk .NET", senderAlias, message)
                End If

            End Try
        End If

    Else
        ' User was not found. Try to find the sender.
        Trace.Write("Recipient '" & recipientAlias & "' not found")
        If senderAlias <> "Talk .NET" Then
            ' Try to send a warning message.
            message = "'" & message & "' could not be delivered."
            SendMessage("Talk .NET", senderAlias, message)
        End If

    End If

End Sub

Of course, in order for this approach to work, you'll need to ensure that no other user can take the user name "Talk .NET." You could add this restriction in your logon or authentication code.
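As a rough sketch of that restriction, the logon code could run a check like the following before adding a user to the collection. The IsReservedAlias() helper and the module are hypothetical names used for illustration. The comparison ignores case and surrounding spaces on the assumption that look-alike names should be rejected too.

```vbnet
Imports System

Module ReservedAliasSketch

    ' The alias the server uses for its own warning messages.
    Public Const SystemAlias As String = "Talk .NET"

    ' Hypothetical helper: returns True if the requested alias would
    ' collide with the server's own name (ignoring case and padding).
    Public Function IsReservedAlias(ByVal requestedAlias As String) As Boolean
        If requestedAlias Is Nothing Then Return False
        Return String.Compare(requestedAlias.Trim(), SystemAlias, True) = 0
    End Function

    Sub Main()
        Console.WriteLine(IsReservedAlias("talk .net"))
        Console.WriteLine(IsReservedAlias("Lucy"))
    End Sub

End Module
```

The logon method could then throw an exception (or return a failure code) whenever IsReservedAlias() returns True.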

Toward Decentralization

Talk .NET will always require some sort of centralized server component in order to store information about logged-on users and their locations. However, it's not necessary to route all communication through the server. In fact, Remoting allows clients to communicate directly, with a few quirks.

Remoting is designed as an object-based networking technology. In order for clients to communicate directly, they need to have a reference to each other's remotable ClientProcess object. As you've already learned, you can create this reference through a configuration file or .NET Remoting code, if you know the appropriate URL. This is how the client contacts the coordination server in the Talk .NET system—by knowing the computer and port where it's located. But there's also another approach: by passing an object reference. The server calls the client back by using one of its stored ITalkClient references.

The ITalkClient reference isn't limited to exchanges between the server and client. In fact, this reference can be passed to any computer on the network. Because ITalkClient references a remotable object (in this case, ClientProcess), whenever the reference travels to another application domain, it actually takes the form of an ObjRef: a network pointer that encapsulates all the information needed to describe the object and its location on the network. With this information, any .NET application can dynamically construct a proxy and communicate with the client it references. You can use the ObjRef as the basis for decentralized communication.

To see this in action, modify the ITalkServer interface to expose an additional method that returns an ITalkClient reference for a specific user:


Public Interface ITalkServer

    ' (Other code omitted.)
    Function GetUser(ByVal [alias] As String) As ITalkClient

End Interface

Now, implement the GetUser() method in the ServerProcess class:

Public Function GetUser(ByVal [alias] As String) As TalkComponent.ITalkClient _
  Implements TalkComponent.ITalkServer.GetUser

    ' The collection stores Object references, so cast before returning.
    Return CType(ActiveUsers([alias]), ITalkClient)

End Function

Now the ClientProcess class can call GetUser() to retrieve the ITalkClient reference of the peer it wants to communicate with; it can then call the ITalkClient.ReceiveMessage() method directly:

Public Sub SendMessage(ByVal recipientAlias As String, ByVal message As String)

    Dim Peer As ITalkClient = Server.GetUser(recipientAlias)
    Peer.ReceiveMessage(message, Me.Alias)

End Sub

With this change in place, the system will work exactly the same. However, the coordination server is now simply being used as a repository of connection information. Once the lookup is performed, it's no longer required.

Note

You can find this version of the application in the Talk .NET Decentralized directory with the online samples for this chapter.

Which approach is best? There's little doubt that the second choice is more authentically peer-to-peer. But the best choice for your system depends on your needs. Some of the benefits of the server-focused approach include the following:

The server can track system activity, which could be useful, depending on your reporting needs. If you run the second version of this application, you'll see that the server trace log reflects when users are added or removed, but it doesn't contain any information when messages are sent.
The connectivity is likely to be better. Typically, if a client can contact the server, the server will be able to call the client. However, two arbitrary clients may not be able to interact, depending on firewalls and other aspects of network topology.
The server can offer some special features that wouldn't be possible in a decentralized system, such as multiuser broadcasts that involve thousands of users.

On the other hand, the benefits of the decentralized approach include the following:

The server has no ability to monitor conversations. This translates into better security (assuming peers don't fully trust the behavior of the server).
The possibility for a server bottleneck decreases. This is because the server isn't called on to deal with messages, but rather, only to provide client lookup, thereby reducing its burden and moving network traffic out to the edges of the network.

Most peer-to-peer supporters would prefer the decentralized approach. However, the current generation of instant-messaging applications avoids it for connectivity reasons. Instead, these applications use systems that more closely resemble the client-server model.

In some cases you might want to adopt a blended approach that makes use of both of these techniques. One option is to allow the client to specify the behavior through a configuration setting. Another option would be to use peer-to-peer communication only when large amounts of data need to be transmitted. This is the approach used in the next section to provide a file transfer service for Talk .NET.

In any case, if you adopt the decentralized approach, you can further reduce the burden on the central coordinator by performing the client lookup once, and then reusing the connection information for all subsequent messages. For example, you could cache the retrieved client reference in a local ActiveUsers collection, and update it from the server if an error is encountered while sending a message. Or, you might modify the system so that the GetUsers() method returns the entire collection, complete with user names and ITalkClient network pointers. The central coordinator would then simply need to support continuous requests to three methods: AddUser(), RemoveUser(), and GetUsers(). This type of design works well if you use "buddy lists" to determine who a user can communicate with. That way, users will only retrieve information about a small subset of the total number of users when they call GetUsers().
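The caching idea can be sketched as follows. This is only an illustration under stated assumptions: the PeerCache class, its Lookups counter, and the two stub classes are hypothetical, and the interfaces are trimmed to the single member each that the sketch needs.

```vbnet
Imports System
Imports System.Collections

Public Interface ITalkClient
    Sub ReceiveMessage(ByVal message As String, ByVal senderAlias As String)
End Interface

Public Interface ITalkServer
    Function GetUser(ByVal [alias] As String) As ITalkClient
End Interface

' Hypothetical helper that a ClientProcess could own.
Public Class PeerCache
    Private Coordinator As ITalkServer
    Private KnownPeers As New Hashtable()
    Public Lookups As Integer = 0   ' Counts trips to the server (demo only).

    Public Sub New(ByVal coordinator As ITalkServer)
        Me.Coordinator = coordinator
    End Sub

    Public Sub Send(ByVal recipientAlias As String, _
      ByVal message As String, ByVal senderAlias As String)

        Dim peer As ITalkClient = _
          CType(KnownPeers(recipientAlias), ITalkClient)

        If Not peer Is Nothing Then
            Try
                peer.ReceiveMessage(message, senderAlias)
                Return
            Catch Err As Exception
                ' The cached reference looks stale. Discard it and fall
                ' through to a fresh lookup.
                KnownPeers.Remove(recipientAlias)
            End Try
        End If

        Lookups += 1
        peer = Coordinator.GetUser(recipientAlias)
        If peer Is Nothing Then
            Throw New ApplicationException( _
              "User '" & recipientAlias & "' is not logged on.")
        End If

        peer.ReceiveMessage(message, senderAlias)
        KnownPeers(recipientAlias) = peer
    End Sub
End Class

' Stand-ins for the remote objects, so the sketch runs in one process.
Public Class EchoClient
    Implements ITalkClient
    Public Sub ReceiveMessage(ByVal message As String, _
      ByVal senderAlias As String) Implements ITalkClient.ReceiveMessage
        Console.WriteLine(senderAlias & ": " & message)
    End Sub
End Class

Public Class SingleUserServer
    Implements ITalkServer
    Public Function GetUser(ByVal [alias] As String) As ITalkClient _
      Implements ITalkServer.GetUser
        Return New EchoClient()
    End Function
End Class

Module PeerCacheDemo
    Sub Main()
        Dim cache As New PeerCache(New SingleUserServer())
        cache.Send("Lucy", "Hello", "Matt")
        cache.Send("Lucy", "Still there?", "Matt")
        Console.WriteLine("Server lookups: " & cache.Lookups.ToString())
    End Sub
End Module
```

The sketch catches every exception type for brevity. A production version would probably retry only on the Remoting or socket errors that indicate a dead connection, because retrying after an error raised inside the peer's own ReceiveMessage() code could deliver the message twice.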

Adding a File Transfer Feature

Using the decentralized approach, it's easy to implement a file transfer feature that's similar to the one provided by Microsoft's Windows Messenger. This feature wouldn't be practical with the centralized approach because it encourages the server to become a bottleneck. Although transferring files isn't a complex task, it can take time, and the CLR only provides a limited number of threads to handle server requests. If all the threads are tied up with sending data across the network (or waiting as data is transferred over a low-bandwidth connection), subsequent requests will have to wait—and could even time out.

The file transfer operation can be broken down into four steps:

Peer A offers a file to Peer B.
Peer B accepts the file offer and initiates the transfer.
Peer A sends the file to Peer B.
Peer B saves the file locally in a predetermined directory.

These steps require several separate method calls. Typically, in step 2, the user will be presented with some sort of dialog box asking whether the file should be transferred. It's impractical to leave the connection open while this message is being displayed because there's no guarantee the user will reply promptly, and the connection could time out while waiting. Instead, the peer-to-peer model requires a looser, disconnected architecture that completely separates the file offer and file transfer.

The first step needed to implement the file transfer is to redefine the ITalkClient interface. It's at this point that most of the coding and design decisions are made.

Public Interface ITalkClient

    ' (Other code omitted.)
    Sub ReceiveFileOffer(ByVal filename As String, _
      ByVal fileIdentifier As Guid, ByVal senderAlias As String)
    Function TransferFile(ByVal fileIdentifier As Guid, _
      ByVal senderAlias As String) As Byte()

End Interface

You'll notice that both methods use a globally unique identifier (GUID) to identify the file. There are several reasons for this approach, all of which revolve around security. If the TransferFile() method accepted a full file name, a client could initiate a transfer even if the file had never been offered, thereby compromising data security. To avoid this problem, every offered file is identified by a freshly generated GUID, which makes it very unlikely that a client will guess the identifier for a file offered to another user. (GUIDs are designed for uniqueness rather than secrecy, so treat this as a deterrent, not a guarantee.) Also, because GUID collisions are vanishingly rare in practice, a peer can offer multiple files to different users without confusion. More elaborate security approaches are possible, but this approach is a quick and easy way to prevent users from getting hold of the wrong files.

The file itself is transferred as a large byte array. While this will be sufficient in most cases, if you want to control how the data is streamed over the network, you'll need to use a lower-level networking class, such as the ones described in the second part of this book.

Once the ITalkClient interface is updated, you can begin to revise the ClientProcess class. The first step is to define a Hashtable collection that can track all the outstanding file offers since the application was started:

Private OfferedFiles As New Hashtable()

To offer a file, the TalkClient calls the public SendFileOffer() method. This method looks up the client reference, generates a new GUID to identify the file, stores the information, and sends the offer.

Public Sub SendFileOffer(ByVal recipientAlias As String, _
  ByVal sourcePath As String)

    ' Retrieve the reference to the other user.
    Dim peer As ITalkClient = Server.GetUser(recipientAlias)

    ' Create a GUID to identify the file, and add it to the collection.
    Dim fileIdentifier As Guid = Guid.NewGuid()
    OfferedFiles(fileIdentifier) = sourcePath

    ' Offer the file.
    peer.ReceiveFileOffer(Path.GetFileName(sourcePath), fileIdentifier, Me.Alias)

End Sub

Notice that only the file name is transmitted, not the full file path. The full file path is stored for future reference in the Hashtable collection, but it's snipped out of the offer using the Path class from the System.IO namespace. This extra step is designed to prevent the recipient from knowing where the offered file is stored on the offering peer.

Tip

Currently, the TalkClient doesn't go to any extra work to "expire" an offered file and remove its information from the collection if it isn't transferred within a set period of time. This task could be accomplished using a separate thread that would periodically examine the collection. However, because the in-memory size of the OfferedFiles collection will always remain relatively small, this isn't a concern, even after making a few hundred unclaimed file offers.
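If you did want to expire stale offers, one possible shape for the bookkeeping is sketched below. It assumes each entry records when the offer was made; the FileOffer and OfferTable classes are hypothetical and would replace the bare path stored in OfferedFiles.

```vbnet
Imports System
Imports System.Collections

' Hypothetical record stored in place of the bare source path.
Public Class FileOffer
    Public SourcePath As String
    Public OfferedAt As DateTime

    Public Sub New(ByVal sourcePath As String, ByVal offeredAt As DateTime)
        Me.SourcePath = sourcePath
        Me.OfferedAt = offeredAt
    End Sub
End Class

Public Class OfferTable

    Private OfferedFiles As New Hashtable()

    ' Records a new offer and returns the GUID that identifies it.
    Public Function Add(ByVal sourcePath As String, _
      ByVal offeredAt As DateTime) As Guid
        Dim fileIdentifier As Guid = Guid.NewGuid()
        SyncLock OfferedFiles.SyncRoot
            OfferedFiles(fileIdentifier) = New FileOffer(sourcePath, offeredAt)
        End SyncLock
        Return fileIdentifier
    End Function

    Public Function Contains(ByVal fileIdentifier As Guid) As Boolean
        SyncLock OfferedFiles.SyncRoot
            Return OfferedFiles.Contains(fileIdentifier)
        End SyncLock
    End Function

    ' Removes every offer older than maxAge and returns how many
    ' were dropped.
    Public Function RemoveExpired(ByVal currentTime As DateTime, _
      ByVal maxAge As TimeSpan) As Integer
        SyncLock OfferedFiles.SyncRoot
            ' Collect the keys first, because a Hashtable can't be
            ' modified while it's being enumerated.
            Dim expired As New ArrayList()
            For Each entry As DictionaryEntry In OfferedFiles
                Dim offer As FileOffer = CType(entry.Value, FileOffer)
                If currentTime.Subtract(offer.OfferedAt).CompareTo(maxAge) > 0 Then
                    expired.Add(entry.Key)
                End If
            Next
            For Each key As Object In expired
                OfferedFiles.Remove(key)
            Next
            Return expired.Count
        End SyncLock
    End Function

End Class
```

A timer or background thread could call RemoveExpired(DateTime.Now, TimeSpan.FromMinutes(30)) periodically. The 30-minute limit is an arbitrary example. The SyncLock blocks are there because that call would run on a different thread from the one that adds offers.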

The file offer is received by the destination peer with the ReceiveFileOffer() method. When this method is triggered, the ClientProcess class raises a local event to alert the user:

Event FileOfferReceived(ByVal sender As Object, _
  ByVal e As FileOfferReceivedEventArgs)

Private Sub ReceiveFileOffer(ByVal filename As String, _
  ByVal fileIdentifier As System.Guid, ByVal senderAlias As String) _
  Implements TalkComponent.ITalkClient.ReceiveFileOffer

    RaiseEvent FileOfferReceived(Me, _
      New FileOfferReceivedEventArgs(filename, fileIdentifier, senderAlias))

End Sub

The FileOfferReceivedEventArgs class simply provides the file name, file identifier, and sender's name:

Public Class FileOfferReceivedEventArgs
    Inherits EventArgs
    Public Filename As String
    Public FileIdentifier As Guid
    Public SenderAlias As String

    Public Sub New(ByVal filename As String, ByVal fileIdentifier As Guid, _
      ByVal senderAlias As String)
        Me.Filename = filename
        Me.FileIdentifier = fileIdentifier
        Me.SenderAlias = senderAlias
    End Sub

End Class

The event is handled in the form code, which will then ask the user whether the transfer should be accepted. If it is, the next step is to call the ClientProcess.AcceptFile() method, which initiates the transfer.

Public Sub AcceptFile(ByVal recipientAlias As String, _
  ByVal fileIdentifier As Guid, ByVal destinationPath As String)

    ' Retrieve the reference to the other user.
    Dim peer As ITalkClient = Server.GetUser(recipientAlias)

    ' Create an array to store the data.
    Dim FileData As Byte()

    ' Request the file.
    FileData = peer.TransferFile(fileIdentifier, Me.Alias)
    Dim fs As FileStream

    ' Create the local copy of the file in the desired location.
    ' Warning: This method doesn't bother to check if it's overwriting
    ' a file with the same name.
    fs = File.Create(destinationPath)
    fs.Write(FileData, 0, FileData.Length)

    ' Clean up.
    fs.Close()

End Sub

There are several interesting details in this code:

It doesn't specify the destination file path and file name. This information is supplied to the AcceptFile() method through the destinationPath parameter. This allows the form code to stay in control, perhaps using a default directory or prompting the user for a destination path.
It includes no exception-handling code. The assumption is that the form code will handle any errors that occur and inform the user accordingly.
It doesn't worry about overwriting a file that may already exist with the same name at the specified path. Once again, this is for the form code to check, ideally by prompting the user before it starts the file transfer.
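As one illustration of the kind of check the form code might perform, the following sketch builds a destination path that won't clobber an existing file. The GetSafeDestination() helper is hypothetical. It also strips any directory information from the offered name, on the assumption that a peer shouldn't be able to choose where the file lands.

```vbnet
Imports System
Imports System.IO

Module DestinationPathSketch

    ' Hypothetical helper: returns a path inside the given folder that
    ' doesn't match any existing file.
    Public Function GetSafeDestination(ByVal folder As String, _
      ByVal offeredName As String) As String

        ' Discard any directory information the sender may have included.
        Dim name As String = Path.GetFileName(offeredName)
        If name Is Nothing OrElse name.Length = 0 Then
            Throw New ArgumentException( _
              "The offer did not include a file name.")
        End If

        ' Append " (1)", " (2)", and so on until the name is unused.
        Dim candidate As String = Path.Combine(folder, name)
        Dim counter As Integer = 1
        While File.Exists(candidate)
            candidate = Path.Combine(folder, _
              Path.GetFileNameWithoutExtension(name) & " (" & _
              counter.ToString() & ")" & Path.GetExtension(name))
            counter += 1
        End While

        Return candidate
    End Function

End Module
```

There's still a small window between the File.Exists() test and the moment the file is created. Opening the destination with FileMode.CreateNew instead of File.Create() would close that gap by failing if the file appears in the meantime.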

The peer offering the file sends it over the network in its TransferFile() method, which is in many ways a mirror image of AcceptFile().

Private Function TransferFile(ByVal fileIdentifier As System.Guid, _
  ByVal senderAlias As String) As Byte() _
  Implements TalkComponent.ITalkClient.TransferFile

    ' Ensure that the GUID corresponds to a valid file offer.
    If Not OfferedFiles.Contains(fileIdentifier) Then
        Throw New ApplicationException( _
          "This file is no longer available from the client.")
    End If

    ' Look up the file path from the OfferedFiles collection and open it.
    Dim fs As FileStream
    fs = File.Open(CStr(OfferedFiles(fileIdentifier)), FileMode.Open)

    ' Fill the FileData byte array with the data from the file.
    ' (ReDim takes an upper bound, so subtract one to size the array
    ' to exactly fs.Length bytes.)
    Dim FileData As Byte()
    ReDim FileData(CInt(fs.Length) - 1)
    fs.Read(FileData, 0, FileData.Length)

    ' Remove the offered file from the collection.
    OfferedFiles.Remove(fileIdentifier)

    ' Clean up.
    fs.Close()
    ' Transmit the file data.
    Return FileData

End Function
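Two details of the file-reading step deserve care. VB array declarations take an upper bound rather than a length, and Stream.Read() is allowed to return fewer bytes than requested. A more defensive version of the read can be sketched as follows. The ReadAllBytes() helper is a hypothetical name, and on .NET 2.0 and later the built-in File.ReadAllBytes() method does the same job.

```vbnet
Imports System
Imports System.IO

Module ReadFileSketch

    ' Reads an entire file into a byte array, looping until every byte
    ' has arrived rather than trusting a single Read() call.
    Public Function ReadAllBytes(ByVal sourcePath As String) As Byte()
        Dim fs As FileStream = File.OpenRead(sourcePath)
        Try
            ' New Byte(n - 1) {} allocates exactly n elements.
            Dim data As Byte() = New Byte(CInt(fs.Length) - 1) {}
            Dim offset As Integer = 0

            While offset < data.Length
                Dim count As Integer = _
                  fs.Read(data, offset, data.Length - offset)
                If count = 0 Then
                    Throw New EndOfStreamException( _
                      "The file ended unexpectedly.")
                End If
                offset += count
            End While

            Return data
        Finally
            ' Release the file handle even if the read fails.
            fs.Close()
        End Try
    End Function

End Module
```

File.OpenRead() requests read-only access, so the offering peer doesn't need write permission on the file it's sharing.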

The only detail we haven't explored is the layer of user-interface code in the Talk form. The first step is to add an "Offer File" button that allows the user to choose a file to send. The file is chosen using the OpenFileDialog class.

Private Sub cmdOffer_Click(ByVal sender As System.Object, _
  ByVal e As System.EventArgs) Handles cmdOffer.Click

    ' Prompt the user for a file to offer.
    Dim dlgOpen As New OpenFileDialog()
    dlgOpen.Title = "Choose a File to Transmit"

    If dlgOpen.ShowDialog() = DialogResult.OK Then
        Try

            ' Send the offer.
            TalkClient.SendFileOffer(lstUsers.Text, dlgOpen.FileName)
        Catch Err As Exception
            MessageBox.Show(Err.Message, "Send Failed", _
                              MessageBoxButtons.OK, MessageBoxIcon.Exclamation)
        End Try
    End If

End Sub

The Talk form code also handles the FileOfferReceived event, prompts the user, and initiates the transfer if accepted (see Figure 4-7).

Figure 4-7: Offering a file transfer


Private Sub TalkClient_FileOfferReceived(ByVal sender As Object, _
  ByVal e As TalkClient.FileOfferReceivedEventArgs) _
  Handles TalkClient.FileOfferReceived

    ' Create the user message describing the file offer.
    Dim Message As String
    Message = e.SenderAlias & " has offered to transmit the file named: "
    Message &= e.Filename & Environment.NewLine
    Message &= Environment.NewLine & "Do You Accept?"

    ' Prompt the user.
    Dim Result As DialogResult = MessageBox.Show(Message, _
      "File Transfer Offered", MessageBoxButtons.YesNo, MessageBoxIcon.Question)

    If Result = DialogResult.Yes Then

        Try
            ' The code defaults to the TEMP directory, although a more
            ' likely option would be to read information from a registry or
            ' configuration file setting.
            Dim DestinationPath As String = "C:\TEMP\" & e.Filename

            ' Receive the file.
            TalkClient.AcceptFile(e.SenderAlias, e.FileIdentifier, _
                                    DestinationPath)

            ' Assuming no error occurred, display information about it
            ' in the chat window.
            txtReceived.Text &= "File From: " & e.SenderAlias
            txtReceived.Text &= " transferred at "
            txtReceived.Text &= DateTime.Now.ToShortTimeString()
            txtReceived.Text &= Environment.NewLine & DestinationPath
            txtReceived.Text &= Environment.NewLine & Environment.NewLine

        Catch Err As Exception
            MessageBox.Show(Err.Message, "Transfer Failed", _
                             MessageBoxButtons.OK, MessageBoxIcon.Exclamation)
        End Try

    End If

End Sub

Figure 4-8: A completed file transfer

Note

Adding a file transfer feature such as this one is a notorious security risk. Because the communication is direct, there's no way to authenticate the recipient. (A central server, on the other hand, could verify that users are who they claim to be.) That means that a file could be offered to the wrong user or to a malicious user who is impersonating another user. To reduce the risk, the server component could require user ID and password information before returning any information from the GetUsers() collection. We'll deal with security more closely in Chapter 11.

Scalability Challenges with the Simple Implementation

In its current form, the Talk .NET application is hard pressed to scale in order to serve a large audience. The key problem is the server component, which could become a critical bottleneck as the traffic increases. To reduce this problem, you can switch to the decentralized approach described earlier, although this is only a partial solution. It won't deal with the possible problems that can occur if the number of users grows so large that storing them in an in-memory hashtable is no longer effective.

Databases and a Stateless Server

To combat this problem, you would need to store the list of logged-on users and their connection information in an external data store such as a database. This would reduce the performance for individual calls (because they would require database lookups), but it would increase the overall scalability of the system (because the memory overhead would be lessened).

This approach also allows you to create a completely stateless coordination server. In this case, you could replace your coordination server with a web farm of computers, each of which would access the same database. Each client request could be routed to the computer with the least traffic, which helps keep response times even under load. Much of the threading code presented in the next chapter would not be needed anymore, because all of the information would be shared in a common database that would provide its own concurrency control. In order to create the server farm and expose it under a single IP address, you would need to use hardware clustering or a software load-balancing solution such as Microsoft's Application Center. All in all, this is a fair approximation of how a system such as Microsoft's Windows Messenger works. It's also similar to the approach followed in the third part of this book, where you'll learn how to create a discovery server using a web service.

OneWay Methods

There is also a minor messaging enhancement you can implement using the OneWay attribute from the System.Runtime.Remoting.Messaging namespace. When you apply this attribute to a method, you indicate that, when this method is called remotely, the caller will disconnect immediately without waiting for the call to complete. This means that the method cannot return a result or modify a ByRef parameter. It also means that any exception thrown in the method will not be detected by the caller. The advantage of this approach is that it eliminates waiting. In the Talk .NET system, the coordination server automatically calls the sender back if a message cannot be delivered. Thus, there's no reason for the client to wait while the message is actually being delivered.

There are currently two methods that could benefit from the OneWay attribute: ClientProcess.ReceiveMessage() and ServerProcess.SendMessage(). Here's an example:

<System.Runtime.Remoting.Messaging.OneWay()> _
Private Sub ReceiveMessage(ByVal message As String, _
  ByVal senderAlias As String) Implements ITalkClient.ReceiveMessage
    ' (Code omitted.)
End Sub

Note that there's one reason you might not want to apply the OneWay attribute to ServerProcess.SendMessage(). If you do, you won't be able to detect an error that might result if the user has disconnected without logging off correctly. Without catching this error, it's impossible to detect the problem, notify the sender, and remove the user from the client collection. This error-handling approach is implemented in the next chapter.

Optional Features

Finally, there are a number of optional features that you can add to Talk .NET. These include variable user status, user authentication with a password, and buddy lists. The last of these is probably the most useful, because it allows you to limit the user list information. With buddy lists, users only see the names of the users that they want to contact. However, buddy lists must be stored on the server permanently, and so can't be held in memory. Instead, this information would probably need to be stored in a server-side database.

Another option would be to store a list on the local computer, which would then be submitted with the login request. This would help keep the system decentralized, but it would also allow the information to be easily lost, and make it difficult for users to obtain location transparency and use the same buddy list from multiple computers. As you'll see, users aren't always prepared to accept the limitations of decentralized peer-to-peer applications.

Firewalls, Ports, and Other Issues

Remoting does not provide any way to overcome some of the difficulties that are inherent with networking on the Internet. For example, firewalls, depending on their settings, can prevent communication between the clients and the coordination server. On a local network, this won't pose a problem. On the Internet, you can lessen the possibility of problems by following several steps:

Use the centralized design in which all communication is routed through the coordination server.
Make sure the coordination server is not behind a firewall (in a company network, you would place the coordination server in the demilitarized zone, or DMZ). This helps connectivity because often communication will succeed when the client is behind a firewall, but not when both the client and server are behind firewalls.
Change the configuration files so that HTTP channels are used instead. They're typically more reliable over the Internet and low-bandwidth connections. You should still use binary formatting, however, unless you're trying to interoperate with non-.NET clients.
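As a rough sketch of the last suggestion, a server-side Remoting configuration file that pairs the HTTP channel with the binary formatter might look like the following. The port number is an arbitrary example, and the typeFilterLevel setting is an assumption that applies to .NET 1.1 and later, where passing client references to the server is likely to require it.

```xml
<configuration>
  <system.runtime.remoting>
    <application>
      <channels>
        <!-- The port is only an example; use the one your server exposes. -->
        <channel ref="http" port="8000">
          <serverProviders>
            <!-- typeFilterLevel applies to .NET 1.1 and later. -->
            <formatter ref="binary" typeFilterLevel="Full" />
          </serverProviders>
          <clientProviders>
            <formatter ref="binary" />
          </clientProviders>
        </channel>
      </channels>
    </application>
  </system.runtime.remoting>
</configuration>
```

The client configuration file would need the matching change, and the server URL the client uses would begin with http:// rather than tcp://.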

It often seems that developers and network administrators are locked in an endless battle, with developers trying to extend the scope of their applications while network administrators try to protect the integrity of their network. This battle has escalated to such a high point that developers tout new features such as .NET web services because they use HTTP and can communicate through a firewall. All this ignores the fact that, typically, the firewall is there to prevent exactly this type of communication. Thwarting this protection just means that firewall vendors will need to go to greater lengths building intelligence into their firewall products. They'll need to perform more intensive network analysis that might reject SOAP messages or deny web-service communication based on other recognizable factors. These changes, in turn, raise the cost of the required servers and impose additional overhead.

In short, it's best to deal with firewall problems by configuring the firewall. If your application needs to use a special port, convince the network administrators to open it. Similarly, using port 80 for a peer-to-peer application is sure to win the contempt of system administrators everywhere. If you can't ensure that your clients can use another port, you may need to resort to this sleight-of-hand, but it's best to avoid the escalating war of Internet connectivity altogether.

Note

Ports are generally divided into three groups: well-known ports (0–1023), registered ports (1024–49151), and dynamic ports (49152–65535). Historically, well-known ports have been used for server-based applications such as web servers (80), FTP (20 and 21), and POP3 mail retrieval (110). In your application, you would probably do best to use a registered or dynamic port that isn't frequently used. These are less likely to cause a conflict (although more likely to be blocked by a firewall). For example, 6346 is most commonly used by Gnutella. For a list of frequently registered ports, refer to the C:\[WinDir]\System32\Drivers\Etc\Services file or the http://www.iana.org/assignments/port-numbers site.

Remoting and Network Address Translation

.NET Remoting, like many types of distributed communication, is challenged by firewalls, proxy servers, and network address translation (NAT). Many programmers (and programming authors) assume that using an HTTP channel will solve these problems. It may—if the intervening firewall restricts packets solely based on whether they contain binary information. However, this won't solve a much more significant problem: Most firewalls allow outgoing connections but prevent all incoming ones. Proxy servers and NAT devices work in the same way. This is a significant limitation. It means that a Talk .NET peer can contact the server (and the server can respond), but the server cannot call back to the client to deliver a message.

There's more than one way to solve this problem, but none is easy (or ideal). You could implement a polling mechanism, whereby every client periodically connects to the server and asks for any unsent messages. The drawback of this approach is that the message latency will be increased, and the load on the server will rise dramatically with the number of clients.
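The server side of a polling design can be sketched as a simple store of undelivered messages. The MessageStore class and its method names are hypothetical. In a real system the server would expose GetPendingMessages() through its remotable interface, and each client would call it from a timer.

```vbnet
Imports System
Imports System.Collections

' Hypothetical server-side store for messages awaiting pickup.
Public Class MessageStore

    ' Maps each alias to an ArrayList of message strings.
    Private Pending As New Hashtable()

    ' Called in place of the client callback when a message arrives.
    Public Sub Enqueue(ByVal recipientAlias As String, _
      ByVal message As String)
        SyncLock Pending.SyncRoot
            Dim waiting As ArrayList = _
              CType(Pending(recipientAlias), ArrayList)
            If waiting Is Nothing Then
                waiting = New ArrayList()
                Pending(recipientAlias) = waiting
            End If
            waiting.Add(message)
        End SyncLock
    End Sub

    ' Called by each client on a timer. Returns the waiting messages
    ' (possibly none) and clears them from the store.
    Public Function GetPendingMessages(ByVal [alias] As String) As String()
        SyncLock Pending.SyncRoot
            Dim waiting As ArrayList = CType(Pending([alias]), ArrayList)
            If waiting Is Nothing Then Return New String() {}
            Pending.Remove([alias])
            Return CType(waiting.ToArray(GetType(String)), String())
        End SyncLock
    End Function

End Class
```

The polling interval sets the trade-off described above. A short interval keeps latency low but multiplies the number of requests the server must answer, while a long interval does the reverse.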

Another approach is to use some sort of bidirectional communication method. For example, you might want to maintain a connection and allow the server to fire its event or callback at any time using the existing connection. This also reduces the number of simultaneous clients the server can handle, and it requires a specially modified type of Remoting channel. Ingo Rammer has developed one such channel, and it's available at http://www.dotnetremoting.cc/projects/modules/BidirectionalTcpChannel.asp. However, this bidirectional channel isn't yet optimized for a production environment, so enterprise developers will need to wait.

Unfortunately, neither of these two proposed solutions will work if you want to use decentralized communication in which peers contact each other directly. In this case, you'll either need to write a significant amount of painful low-level networking code (which is beyond the scope of this book), or use a third-party platform such as those discussed in Part Three.