Yudong Li: 2010

Monday, November 15, 2010

WstxIOException in WebLogic

com.ctc.wstx.exc.WstxIOException:
Tried all: '1' addresses, but could not connect over HTTP to server: 'java.sun.com', port: '80'
at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:683)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
at weblogic.servlet.internal.TldCacheHelper$TldIOHelper.parseXML(TldCacheHelper.java:134)
at weblogic.descriptor.DescriptorCache.parseXML(DescriptorCache.java:380)
at weblogic.servlet.internal.TldCacheHelper.parseTagLibraries(TldCacheHelper.java:65)

Truncated. see log file for complete stacktrace

java.net.ConnectException:
Tried all: '1' addresses, but could not connect over HTTP to server: 'java.sun.com', port: '80'
at weblogic.net.http.HttpClient.openServer(HttpClient.java:312)
at weblogic.net.http.HttpClient.openServer(HttpClient.java:388)
at weblogic.net.http.HttpClient.New(HttpClient.java:238)
at weblogic.net.http.HttpURLConnection.connect(HttpURLConnection.java:172)
at weblogic.net.http.HttpURLConnection.getInputStream(HttpURLConnection.java:356)

Truncated. see log file for complete stacktrace

Sometimes you may run into an error like this, which slows down the deployment process.

One way to work around this issue is to add

"-Djavax.xml.stream.XMLInputFactory=weblogic.xml.stax.XMLStreamInputFactory"

to your WebLogic start script. This prevents WebLogic from fetching any remote XML definition files.
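
To check whether the flag took effect, a small sketch like the following can print which StAX factory the JVM actually picks up. The class name StaxFactoryCheck is made up for illustration; it relies only on the standard behaviour that XMLInputFactory.newInstance() consults the javax.xml.stream.XMLInputFactory system property first.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical helper, not part of WebLogic.
class StaxFactoryCheck {

    // Returns the class name of the StAX input factory the JVM resolves.
    // The javax.xml.stream.XMLInputFactory system property is checked first.
    static String factoryClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("Property: " + System.getProperty("javax.xml.stream.XMLInputFactory"));
        System.out.println("Factory in use: " + factoryClassName());
    }
}
```

Run inside the server JVM (or with the same -D flag), the second line should show the WebLogic factory instead of the Woodstox one from the stack trace.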

Monday, October 4, 2010

Exception Handling Principles

1. System.out.println is expensive. These calls are synchronized for the duration of disk I/O, which significantly slows throughput.

2. By default, stack traces are logged to the console. But browsing the console for an exception trace isn't feasible in a production system.

3. In addition, they aren't guaranteed to show up in the production system, because system administrators can map System.out and System.err to the null device (>nul on NT, /dev/null on UNIX). Moreover, if you're running the J2EE app server as an NT service, you won't even have a console.

4. Even if you redirect the console log to an output file, chances are that the file will be overwritten when the production J2EE app servers are restarted.

5. Using System.out.println during testing and then removing the calls before production isn't an elegant solution either, because doing so means your production code will not function the same as your test code.

6. If you can't handle an exception, don't catch it.

7. Catch an exception as close as possible to its source.

8. If you catch an exception, don't swallow it.

9. Log an exception where you catch it, unless you plan to re-throw it.

10. Preserve the stack trace when you re-throw the exception by wrapping the original exception in the new one.

11. Use as many typed exceptions as you need, particularly for application exceptions. Do not just use java.lang.Exception every time you need to declare a throws clause. A fine-grained throws clause is self-documenting and makes it evident to the caller that different exceptions have to be handled.

12. If you are programming application logic, use unchecked exceptions to indicate an error from which the user cannot recover. If you are creating third-party libraries to be used by other developers, use checked exceptions for unrecoverable errors too.

13. Never throw unchecked exceptions in your methods just because checked exceptions clutter the method signature. There are some scenarios where this is good (e.g. EJB interfaces/implementations, where unchecked exceptions alter the bean behavior in terms of transaction commit and rollback), but otherwise this is not a good practice.

14. Throw application exceptions as checked exceptions and unrecoverable system exceptions as unchecked exceptions.

15. Structure your methods according to how fine-grained your exception handling must be.
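
Principles 7 to 11 can be sketched in a few lines. This is only an illustration: ConfigLoadException and ConfigLoader are hypothetical names, and java.util.logging stands in for whatever logging framework the application really uses.

```java
import java.io.FileReader;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical typed application exception.
class ConfigLoadException extends Exception {
    ConfigLoadException(String message, Throwable cause) {
        super(message, cause); // keeping the cause preserves the original stack trace
    }
}

class ConfigLoader {
    private static final Logger LOG = Logger.getLogger(ConfigLoader.class.getName());

    // Catches the low-level exception close to its source and re-throws it
    // wrapped in a typed exception. It is not logged here because it is re-thrown.
    static void load(String path) throws ConfigLoadException {
        try (FileReader reader = new FileReader(path)) {
            reader.read();
        } catch (IOException e) {
            throw new ConfigLoadException("Cannot load config: " + path, e);
        }
    }

    public static void main(String[] args) {
        try {
            load("/no/such/dir/app.properties");
        } catch (ConfigLoadException e) {
            // Logged once, where it is finally handled, through a logger
            // rather than System.out.
            LOG.log(Level.SEVERE, "Startup failed", e);
        }
    }
}
```

The logged trace contains a "Caused by" section with the original IOException, which is what wrapping is meant to keep.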

Wednesday, August 4, 2010

Develop a simple Maven plugin

While this is a really old topic, developing a Maven plugin still takes some time to sort out, especially for people who are not familiar with Maven, like me. So here is a very short introduction to developing a Maven plugin and integrating it with your existing application.

Prerequisite: of course, you need Maven; everything else is up to you. I use Eclipse with the m2eclipse plugin, which saves me some time when creating the Maven plugin project from its archetype. However, this is a rather simple process and anyone can do it on the command line with only a little typing.

1. Create a new Maven plugin project in Eclipse. Here, we use groupId Featheast and artifactId maven-test-plugin. Please pay attention to the naming convention of the artifactId, which we will use a little bit later.

2. Under the src directory, create a new class named MyRealMojo which extends AbstractMojo. AbstractMojo is the class you must inherit from for the plugin to work, and the only method you have to implement is execute(), where the real business happens.

3. In order for other projects to recognize your plugin's function, you have to specify what the Mojo does. In Maven, this is accomplished by adding a @goal annotation in the class-level comment. This goal name will be used later by other projects to reference this function.

4. You can create any number of fields in the class, which act as parameters for later processing. If you think of the whole class as a function, these fields are the arguments you pass in. For each field, another annotation, @parameter, specifies how to link the field to external configuration.

5. Put the real business logic in the execute() method. You can use the Maven log to print output or debug. The inherited method getLog() is always available for this, and its usage is quite similar to log4j.

import java.util.Iterator;
import java.util.Set;

import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoExecutionException;
import org.apache.maven.plugin.MojoFailureException;

/**
 * @author yudong
 *
 * @goal realmojo
 */
public class MyRealMojo extends AbstractMojo {

    /**
     * @parameter expression="${mymojo.username}"
     */
    private String username;

    /**
     * @parameter expression="${mymojo.password}"
     */
    private String password;

    public void execute() throws MojoExecutionException, MojoFailureException {
        // password is null when the parameter was not supplied
        if (password == null || password.length() < 10) {
            getLog().info("Hey " + username + ", your password is too short!");
        } else {
            getLog().info("Congratulations " + username + ", your password is all right!");
        }

        // Dump every entry of the plugin context to the Maven log
        Set set = getPluginContext().keySet();
        getLog().info("The context include " + set.size() + " entries");
        Iterator iterator = set.iterator();
        while (iterator.hasNext()) {
            Object key = iterator.next();
            getLog().info(key.toString() + " : " + getPluginContext().get(key));
        }
    }
}

6. In the pom.xml, add any dependency you need, then add the maven-plugin-plugin to build the plugin. Specify any goals that you want to be included in the build output.

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-plugin-plugin</artifactId>
      <version>2.5.1</version>
      <configuration>
        <goalPrefix>Plugin.Test</goalPrefix>
        <username>Featheast</username>
      </configuration>
      <goals>
        <goal>realmojo</goal>
      </goals>
    </plugin>
  </plugins>
</build>

7. Now that your Maven plugin is created, build it with the standard command: mvn install.

8. Create another project to use this plugin. Add the plugin configuration in the pom.xml, and specify the goal.

<build>
  <plugins>
    <plugin>
      <groupId>Featheast</groupId>
      <artifactId>maven-test-plugin</artifactId>
      <version>0.0.1-SNAPSHOT</version>
      <executions>
        <execution>
          <phase>install</phase>
          <goals>
            <goal>realmojo</goal>
          </goals>
          <configuration>
            <username>This is the username</username>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

9. You can specify the parameters in the configuration tag, or pass them on the command line using the property names from the @parameter expressions, e.g. -Dmymojo.username=XXX.

10. Finally, once you build the new project, the plugin's output will be printed on the console.

Monday, August 2, 2010

Pre-EJB 3.0 Enterprise Bean

An enterprise bean is a server-side software component that can be deployed in a distributed multi-tiered environment, and it will remain that way going forward. Anyone who has worked with Enterprise JavaBeans technology before knows that there are three types of beans: session beans, entity beans, and message-driven beans. Historically, an EJB component implementation has never been contained in a single source file; a number of files work together to make up an implementation of an enterprise bean. Let us briefly go through these EJB implementation artifacts:

1) Enterprise bean class

The primary part of the bean used to be the implementation itself, which contained the guts of your logic, called the enterprise bean class. This was simply a Java class that conformed to a well-defined interface and obeyed certain rules. For instance, the EJB specification defined a few standard interfaces that your bean class had to implement. Implementing these interfaces forced your bean class to expose certain methods that all beans must provide, as defined by the EJB component model. The EJB container called these required methods to manage your bean and alert your bean to significant events. The most basic interface that all of the session, entity and message-driven bean classes implemented was the javax.ejb.EnterpriseBean interface. This interface served as a marker interface, meaning that implementing this interface indicated that your class was indeed an enterprise bean class. Session beans, entity beans, and message-driven beans each had more specific interfaces that extended javax.ejb.EnterpriseBean, viz. javax.ejb.SessionBean, javax.ejb.EntityBean, and javax.ejb.MessageDrivenBean.

2) EJB Object

When a client wanted to use an instance of an enterprise bean class, the client never invoked the method directly on an actual bean instance. Rather, the invocation was intercepted by the EJB container and then delegated to the bean instance. By intercepting requests, the EJB container could provide middleware services implicitly. Thus, the EJB container acted as a layer of indirection between the client code and the bean. This layer of indirection manifested itself as a single network-aware object called the EJB object. At deployment time, the container would generate the implementation of javax.ejb.EJBObject or javax.ejb.EJBLocalObject, depending on whether the bean supported remote or local clients.
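
The idea of an intercepting stand-in object can be illustrated in plain Java with a dynamic proxy. This is only an analogy under stated assumptions, not how any EJB container is implemented: Greeter, GreeterBean and InterceptingContainer are hypothetical names, and the "middleware service" is just a call log.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical business interface, standing in for a bean's remote or local interface.
interface Greeter {
    String greet(String name);
}

// Hypothetical bean implementation that the client never calls directly.
class GreeterBean implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class InterceptingContainer {
    static final List<String> CALL_LOG = new ArrayList<>();

    // Returns a stand-in that intercepts every call, performs a "middleware"
    // step (here: recording the method name), then delegates to the bean.
    static Greeter wrap(final Greeter bean) {
        InvocationHandler handler = (proxy, method, args) -> {
            CALL_LOG.add(method.getName());
            return method.invoke(bean, args);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(), new Class<?>[] { Greeter.class }, handler);
    }

    public static void main(String[] args) {
        Greeter client = wrap(new GreeterBean());
        System.out.println(client.greet("EJB"));
        System.out.println(CALL_LOG);
    }
}
```

The client only ever holds the wrapped reference, just as an EJB client only ever held the EJB object.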

3) Remote interface

A remote interface, written by the bean provider, consisted of all the methods that were made available to the remote clients of the bean. These methods usually would be business methods that the bean provider wanted the remote clients of the bean to use. Remote interfaces had to comply with special rules that the EJB specification defined. For example, all remote interfaces had to be derived from the javax.ejb.EJBObject interface. The EJB object interface consisted of a number of methods, and the container would implement them for you.

4) Local interface

The local interface, written by the bean provider, consisted of all the methods that were made available to the local clients of the bean. Akin to the remote interface, the local interface provided business methods that the local bean clients could call. The local interface provided an efficient mechanism to enable use of EJB objects within the same Java Virtual Machine, without incurring the overhead of RMI-IIOP. An enterprise bean that expected to be used by remote as well as local clients had to support both local and remote interfaces.

5) Home interface

Home interfaces defined methods for creating, destroying, and finding local or remote EJB objects. They acted as life-cycle interfaces for the EJB objects. Each bean was supposed to have a corresponding home interface. All home interfaces had to extend the standard interface javax.ejb.EJBHome or javax.ejb.EJBLocalHome, depending on whether the enterprise bean was remote or local. The container generated home objects implementing the methods of this interface at the time of deployment. Clients acquired references to the EJB objects via these home objects. Even though the container implemented home interfaces as home objects, an EJB developer was still required to follow certain rules pertaining to the life-cycle methods of a home interface. For instance, for each createXXX() method in the home interface, the enterprise bean class was required to have a corresponding ejbCreateXXX() method.

6) Deployment descriptor

To inform the container about your middleware needs, you as a bean provider were required to declare your component's middleware needs, such as life-cycle management, transaction control, security services, and so on, in an XML-based deployment descriptor file. The container inspected the deployment descriptor and fulfilled the requirements laid out by you. The deployment descriptor thus played the key role in enabling implicit middleware services in the EJB framework.

7) Vendor-specific files

Since all EJB server vendors are different, they each have some proprietary value-added features. The EJB specification did not touch these features, such as how to configure load balancing, clustering, monitoring, and so on. Therefore, each EJB server vendor required you to include additional files specific to that vendor, such as a vendor specific XML or text-based deployment descriptor that the container would inspect to provide vendor-specific middleware services.

8) The Ejb-jar file

The Ejb-jar file, the packaging artifact, consisted of all the other implementation artifacts of your bean. Once you generated your bean classes, your home interfaces, your remote interfaces, and your deployment descriptor, you'd package them into an Ejb-jar file. It is this Ejb-jar file that you, as a bean provider, would pass around for deployment purposes to application assemblers.

Sunday, August 1, 2010

Maven tips

1) Try to ensure there are no duplicate dependencies or conflicting versions of the same dependency in a project, since they lead to errors or conflicts later on.

2) If you only want a dependency to be available at compile time and left out of the packaged artifact, set its scope to provided. The provided scope is not transitive, and the dependency is expected to be supplied by the JDK or the container.

3) Use mvn dependency:tree to display the dependency structure of the whole project. Redirecting the output to a file makes it easier to inspect.

4) If you are sitting behind a firewall, set proxy configurations in settings.xml under your .m2 directory.

More to be continued.

Monday, July 12, 2010

Pipe or Redirect within Java Command

It's common knowledge that Runtime.getRuntime().exec(command) can execute any Unix or Windows command from a Java application. However, when you include a pipe '|' or a redirect '>' in the command, Java does not interpret the command as a shell would, and you end up with an error. For example, when I tried to run an ffmpeg command to encode a video and capture its output in a log file, I got the error "Unable to find a suitable output format for '>'".

To make Java "understand" our purpose, you cannot pass the usual command line directly to exec(). There is a workaround that solves the issue.

Construct an array:

String[] commands = {
    "/bin/sh",
    "-c",
    "YOUR REAL COMMAND HERE"
};

and pass commands as the argument to Runtime.getRuntime().exec(commands). This way, Java starts an sh process (you could use bash instead) to execute your command, and the shell interprets the pipe and the redirect.
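
A minimal, self-contained sketch of this workaround is shown below. It assumes a Unix-like system with /bin/sh; the class name ShellPipe and the echo/tr command are only placeholders for your real command. Only standard output is read here, so a command that writes a lot to standard error would need that stream drained as well.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper class for illustration.
class ShellPipe {

    // Runs the command line through /bin/sh so that '|' and '>' are handled
    // by the shell instead of being passed to the program as plain arguments.
    static String run(String commandLine) {
        String[] commands = { "/bin/sh", "-c", commandLine };
        try {
            Process process = Runtime.getRuntime().exec(commands);
            StringBuilder output = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            process.waitFor();
            return output.toString();
        } catch (IOException e) {
            throw new IllegalStateException("Could not run: " + commandLine, e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running: " + commandLine, e);
        }
    }

    public static void main(String[] args) {
        // The pipe is interpreted by sh, not by Java.
        System.out.print(run("echo hello | tr a-z A-Z"));
    }
}
```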

Thursday, July 8, 2010

Seven Java EE Performance Problems

1) Slow-running applications

2) Applications that degrade over time

3) Slow memory leaks that gradually degrade performance

4) Huge memory leaks that crash the application server

5) Periodic CPU spikes and application freezes

6) Applications that behave significantly differently under a heavy load than under normal usage patterns

7) Problems or anomalies that occur in production but cannot be reproduced in a test environment

Wednesday, July 7, 2010

Long IP Address

Most of the time, people think of an IP address as a String. In Java especially, developers usually handle IP addresses with either the URL or the String class. However, representing an IP address as a String has several disadvantages. First, a String usually takes more memory than the same value stored as an int or long, and comparing IP addresses as Strings is awkward. More importantly, it is not easy to determine whether an IP address lies within the range between two other IP addresses.

Since an IPv4 address is composed of four integers ranging from 0 to 255, it is easy to convert a String IP address to a numerical form that also uniquely represents the address. That is how the Long IP address emerges.

A simple method to convert the String IP address to a Long IP address (A.B.C.D) would be:

256*256*256*A + 256*256*B + 256*C + D
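
As a rough sketch, the formula translates to Java as follows. The class and method names (LongIp, toLong, inRange) are made up for illustration; the code deliberately avoids InetAddress.

```java
// Hypothetical helper class for illustration.
class LongIp {

    // Converts a dotted IPv4 string "A.B.C.D" to
    // 256*256*256*A + 256*256*B + 256*C + D.
    static long toLong(String ip) {
        String[] parts = ip.trim().split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Octet out of range: " + part);
            }
            value = value * 256 + octet; // shift previous octets left by one position
        }
        return value;
    }

    // Range check that is awkward with Strings but trivial with longs.
    static boolean inRange(String ip, String low, String high) {
        long v = toLong(ip);
        return toLong(low) <= v && v <= toLong(high);
    }

    public static void main(String[] args) {
        System.out.println(toLong("192.168.1.1")); // prints 3232235777
        System.out.println(inRange("10.0.0.5", "10.0.0.0", "10.0.0.255")); // prints true
    }
}
```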

Long IP addresses are helpful in certain scenarios. One is IP-to-location mapping on Google App Engine. GeoIP, a very popular data set created by MaxMind, is heavily used in many projects. When parsing an IP address, its Java library first transforms the IP String into an InetAddress, then calls getAddress() to get the byte[], and finally computes the Long value. This causes no problem on other platforms. On Google App Engine, however, it fails because InetAddress is on GAE's blacklist, which means you cannot use this class. The workaround is the conversion method above: it gives you directly the same Long value the library would have calculated.

There may be other places where Long IP addresses are useful, especially when dealing with range queries on IP addresses.

Tuesday, June 15, 2010

tmpreaper and ctime, atime, and mtime in Ubuntu

There is a package in Ubuntu that can clean directories of files older than a certain period of time. Before we get into that, let's first clarify three terms related to file times in Ubuntu: ctime, atime and mtime.

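ctime is the change time of the file's inode. It is often mistaken for the creation time: a file I create at Wed Jun 16, 9:45:15, 2010 starts with exactly that ctime, but the value is updated whenever the file's metadata (permissions, ownership) or content changes later.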

atime is the access time of the file. Displaying the file contents or executing the file script will update the atime.

mtime is the modification time, which is updated when the actual content of the file is modified.
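
To see these timestamps for a given file before letting a cleaner loose on it, a small Java sketch such as the one below can help. FileTimes is a hypothetical class name; the "unix:ctime" attribute is only read when the file system reports a "unix" attribute view, which is the assumption on Linux.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper class for illustration.
class FileTimes {

    // Returns a short report with the access, modification and (where
    // available) inode change time of the given file.
    static String describe(Path file) {
        try {
            BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
            StringBuilder report = new StringBuilder();
            report.append("atime: ").append(attrs.lastAccessTime()).append('\n');
            report.append("mtime: ").append(attrs.lastModifiedTime()).append('\n');
            // ctime is not part of the basic view; ask the unix view if present.
            if (file.getFileSystem().supportedFileAttributeViews().contains("unix")) {
                report.append("ctime: ").append(Files.getAttribute(file, "unix:ctime")).append('\n');
            }
            return report.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path target = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.print(describe(target));
    }
}
```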

Back to the tmpreaper command: since it is not installed in Ubuntu by default, you have to run sudo apt-get install tmpreaper to get the latest version.

The cleanup is invoked with a simple command: tmpreaper TIME-FORMAT DIRS.

TIME-FORMAT specifies how long a file must have gone unaccessed before it is removed. By default the check is based on atime, so even if you modified the content at a later stage, the file might still be deleted if it has not been accessed since. Of course, you can make the command work in terms of mtime instead by appending --mtime to the original command.

DIRS is the directory (or directories) you would like to clean, such as /tmp. Never try this on the root directory or you may end up with a disaster.

If you had to run the command manually every time, there would be little point in using it. Its real power comes from combining it with another tool, crontab.

Crontab is used to create cron jobs that run a specific script periodically. To set up a cron job, all you need to do is write a script that includes the command discussed above and then edit the crontab configuration file; the script will then run in the background as scheduled.

To edit the configuration file, simply run sudo crontab -e and add an entry to the file. The format of an entry is m h dom mon dow command; the first five fields are separated by spaces, and you can use an asterisk as a wildcard meaning "any".
For example, * * * * * /XXX.bash will run every minute. More usage can be found in the documentation.

Sunday, June 13, 2010

Handling Azure Large File Upload

In Azure storage, files smaller than 64MB can be stored directly as a single blob. However, when you want to store a file larger than 64MB, things become a little more complicated. The way to accomplish this task is to use the block list service.

A block, unlike a blob, is a small chunk of a file; blocks can be aggregated as a list to form a large file, and each chunk is limited to 4MB. For example, if you have a 100MB file that you want to store in Azure, you have to split the file manually into at least 25 pieces and then use the Put Block and Put Block List operations to upload all 25 items. More details are listed below:

1) Split large files: this can be done in various ways, with existing tools or with your own simple code. Be sure to write down the file names and keep them in the order in which you split them.

2) Put Block: Each of the pieces created in the last step is called a block, and the second step uploads the blocks one by one into the storage via the Put Block operation. The basic process is no different from other operations. One thing to pay attention to, however, is that the blockid is a required parameter and all blockids of a blob must have the same length. In our example, you can use a Base64 blockid of arbitrary length less than 64, but you have to make sure all 25 blockids have the same length. If not, a 400 error with the message "The specified blob or block content is invalid" will be returned.

3) Put Block List: The last but not least step is to notify the server that all pieces have been uploaded, so that it can combine them into a single blob.

After these three steps, you will be able to upload files of any size into Azure storage.
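
The bookkeeping for steps 1 and 2 can be sketched in plain Java. This only plans the blocks and generates equal-length Base64 block IDs; it does not call Azure. BlockPlanner and the "block-000000" ID pattern are assumptions made for illustration, and the 4MB limit is the one stated above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; performs no network calls.
class BlockPlanner {
    static final long MAX_BLOCK_BYTES = 4L * 1024 * 1024; // 4MB per block

    // Number of blocks needed for a file of the given size (rounded up).
    static int blockCount(long fileBytes) {
        return (int) ((fileBytes + MAX_BLOCK_BYTES - 1) / MAX_BLOCK_BYTES);
    }

    // Builds block IDs from a zero-padded index, so that every Base64 ID has
    // the same length and sorting the IDs keeps the split order.
    static List<String> blockIds(int count) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String raw = String.format(Locale.ROOT, "block-%06d", i);
            ids.add(Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8)));
        }
        return ids;
    }

    public static void main(String[] args) {
        int count = blockCount(100L * 1024 * 1024); // the 100MB example
        System.out.println(count + " blocks");
        System.out.println(blockIds(count));
    }
}
```

Zero-padding the index is one simple way to satisfy the equal-length requirement; any scheme that yields IDs of identical length would do.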

Tuesday, May 25, 2010

How to tell the version of your Mac OS

Step 1: Click the apple icon in the upper left corner of the screen

Step 2: Choose About this Mac

Step 3: The label right below the Apple indicates the version of this Mac OS, e.g. Version 10.6.2

Step 4: To the right of the Processor label is the model of your CPU, from which we can tell whether the CPU is 32-bit or 64-bit

Here is a table to clarify the relationships:

Processor Name	32- or 64-bit
Intel Core Solo	32 bit
Intel Core Duo	32 bit
Intel Core 2 Duo	64 bit
Intel Quad-Core Xeon	64 bit
Dual-Core Intel Xeon	64 bit
Quad-Core Intel Xeon	64 bit

Ref : http://support.apple.com/kb/ht3696

Saturday, May 15, 2010

Redirect vs Forward

In the context of web programming, there is a lot of confusion between Redirect and Forward. Here is a short summary of the differences between the two.

1) A Forward can only go to an internal page, whereas a Redirect can go to either an internal or an external page.

2) Forward is much faster than Redirect

3) With a Forward the browser is unaware of what happens and the address bar keeps the original URL, while a Redirect makes the browser issue a new request and update its address to the new link.

4) Because the browser is unaware of a Forward, refreshing the page re-sends the original request, since the URL has not changed; after a Redirect, refreshing only reloads the new URL.

5) For the same reason, operations that have side effects, such as updating the state of a database, should be followed by a Redirect, so that a refresh after a Forward does not repeat the operation.

Wednesday, May 12, 2010

Ubuntu Path Setting

/etc/profile is loaded once on login for every user
/etc/bash.bashrc is loaded every time every user opens a terminal
~/.bashrc is loaded every time a single user opens a terminal
~/.profile is loaded once when a single user logs on

Monday, May 10, 2010

Integrating Amazon S3 and CloudFront for video streaming

Amazon is a key player in cloud computing, and the services it provides can usually satisfy our requirements if you dig a little deeper. Here, I'd like to show how to integrate the S3 and CloudFront services to provide video streaming.

First, let's clarify some basic concepts of S3 and CloudFront. Amazon S3 is a storage provider which can store all kinds of data. To deliver content more rapidly to users around the globe, CloudFront uses its edge locations to improve response time. When people fetch content via CloudFront, the service automatically routes the user to the closest location that hosts a replica of the data. However, CloudFront doesn't provide storage itself; that part is covered by S3. That's why the two always appear together.

1. Distribute S3 bucket to CloudFront location

OK, so our first step is to link S3 and CloudFront to allow distribution. It can be done with the S3Fox plugin for Firefox, or programmatically. But we need to clarify another issue before we move on. There are two kinds of distribution in CloudFront: static file distribution and streaming distribution. Currently, the latest S3Fox only supports static distribution, which can be set up simply by right-clicking the bucket and choosing 'manage the distribution'. To create a streaming distribution for our bucket, writing code is more or less the only choice. Here is a short snippet of Java code that does this using the JetS3t library.

// cloudFrontService, bucket and log are assumed to be initialized elsewhere.
try {
    // The current time serves as a unique caller reference.
    StreamingDistribution newStreamingDistribution = cloudFrontService.createStreamingDistribution(
            bucket.getName(), "" + System.currentTimeMillis(),
            null, "Test streaming distribution", true);
    log.info("New Streaming Distribution: " + newStreamingDistribution);

    // Fetch the config only after creation succeeded, so the ID is never
    // read from a null distribution.
    StreamingDistributionConfig streamingDistributionConfig =
            cloudFrontService.getStreamingDistributionConfig(newStreamingDistribution.getId());
    log.info("Streaming Distribution Config: " + streamingDistributionConfig);
} catch (CloudFrontServiceException e) {
    log.error(e.getMessage());
}

2. After we enable the streaming distribution for the bucket, we get a distribution URL. This is the base URL that we will use throughout the whole process. Now we can use any method to upload a multimedia file into the bucket we just created. Next we will use the Flowplayer rtmp plugin to play the file in a browser.

Download the latest version of Flowplayer as well as its rtmp plugin, and write the following HTML page:

<html>
<head>
<title>Video</title>
<script src="flowplayer/flowplayer-3.1.4.min.js"></script>
</head>
<body>
<a class="rtmp" href="50f9307fbcdcdcaae65c4bc58857ca19-LOW" style="display:block;width:640px;height:360px;"></a>
<script type="text/javascript">
$f("a.rtmp", "flowplayer/flowplayer-3.1.5.swf", {
  clip: { provider: 'rtmp', autoPlay: true },
  plugins: {
    rtmp: {
      url: 'flowplayer/flowplayer.rtmp-3.1.3.swf',
      netConnectionUrl: 'rtmp://s240vvr18v7md1.cloudfront.net/cfx/st'
    }
  }
});
</script>
</body>
</html>

Several things to note:

1) you must have the flowplayer.js, flowplayer.swf, and flowplayer.rtmp.swf ready to use and with the right path.

2) In the anchor tag, the href attribute is the path/name of the file you would like to play. For example, if your file is XXX.mp3 with the full path http://AAA.s3.amazonaws.com/XXX.mp3, then you should put XXX in the href attribute. DON'T ADD THE EXTENSION!

3) In the JavaScript section, autoPlay indicates whether to play the file automatically, and netConnectionUrl must be set according to the distribution URL you retrieved in part 1. Remember, you must prefix the URL with "rtmp://" and append "/cfx/st" to it, and you must not forget the single quotes around the whole URL.
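
If the page is generated from Java, the two string rules in notes 2) and 3) can be captured in a tiny helper. This is just a sketch: StreamingUrls and its method names are made up, and the distribution domain is whatever part 1 returned.

```java
// Hypothetical helper class for illustration.
class StreamingUrls {

    // Builds the netConnectionUrl: "rtmp://" + distribution domain + "/cfx/st".
    static String netConnectionUrl(String distributionDomain) {
        return "rtmp://" + distributionDomain + "/cfx/st";
    }

    // Strips the extension from the file name, as required for the href attribute.
    static String clipName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    public static void main(String[] args) {
        System.out.println(netConnectionUrl("s240vvr18v7md1.cloudfront.net"));
        System.out.println(clipName("XXX.mp3"));
    }
}
```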

Now you have streaming video playing in your browser! Easy and sweet!

Thursday, May 6, 2010

GAE/J deployment transaction conflict error 409

Today when I tried to deploy my application to App Engine, a weird error popped up:

Unable to update app: Error posting to URL: https://appengine.google.com/api/appversion/create?app_id=metacdn&version=1&
409 Conflict
Another transaction by user featheast.lee is already in progress for this app and major version. That user can undo the transaction with appcfg.py's "rollback" command.
See the deployment console for more details

I hadn't seen this error before and had no clue where it came from. After several pages of Google results, I found a way to solve the problem.

1. Open your terminal and go to the directory of your project. (Windows users, please do the same in CMD.)

2. Run the appcfg.sh script with the arguments rollback and war (your war directory). You need to prefix the path to appcfg.sh, which is usually under the Eclipse plugin directory. (Windows users, use the cmd script in place of the sh script.)

3. Once the deployment has been rolled back successfully, you are free to deploy again.

I guess the same applies to Python as well: simply use appcfg.py instead of the script and things should work out.

PS: I got a ZipException during the rollback with a warning that said "Could not find API version from .svn". It still solved my deployment issue, though.

Tuesday, May 4, 2010

Setting up HTTP and HTTPS in a Restlet environment

Restlet is a handy tool for setting up an environment that runs RESTful web services. To secure some endpoint resources, it is sometimes necessary to use HTTPS for communication, and Restlet supports both protocols.

1) HTTP

It's pretty easy to set up the HTTP environment: all you need to do is create a new component and register the HTTP protocol with the component's servers, and things will work as expected.

Component component = new Component();
component.getServers().add(Protocol.HTTP, 8183);
component.getDefaultHost().attach(new XXXApplication());
component.start();

Note that 8183 is the port number, which you have to provide.

2) HTTPS

Unlike HTTP, in HTTPS mode you need to provide three more things: keystorePath, keystorePassword, and keyPassword.

For those who are not familiar with keystore: A Java container of keys and certificates is called a keystore. There are two usages for keystores: as a keystore and as a truststore. The keystore contains the material of the local entity, that is the private key and certificate that will be used to connect to the remote entity. Its counterpart, the truststore, contains the certificates that should be used to check the authenticity of the remote entity's certificates.

The steps to construct a keystore are detailed on this page: http://wiki.restlet.org/docs_2.0/13-restlet/27-restlet/46-restlet/213-restlet.html. Basically, you need to use an SSL tool to generate the keys first and then self-sign the certificate. After these steps are finished, follow the code below and the HTTPS server will run successfully.

Component component = new Component();
Server server = component.getServers().add(Protocol.HTTPS, 8183);
server.getContext().getParameters().add("keystorePath", keystorePath);
server.getContext().getParameters().add("keystorePassword", keystorePassword);
server.getContext().getParameters().add("keyPassword", keyPassword);
component.getDefaultHost().attach(new XXXApplication());
component.start();

By the same logic, if you also need to run client-side code in the same project, simply add the client's protocol to the component:

component.getClients().add(Protocol.HTTPS);

Monday, May 3, 2010

Using GAE python to bulk load CSV data into Java datastore

The official documentation is here: http://code.google.com/appengine/docs/python/tools/uploadingdata.html. This post covers more details for Java applications.

1. Using any Windows environment to download Python SDK 2.5.X, preferably 2.5.4 since it is the last stable version with Windows Installer. Avoid to download 2.6.X and 3.X.X because GAE doesn't officially support these.

2. Download Google App Engine SDK for Python. Current version is 1.3.3. You may download GAE launcher which is only available in Windows.

3. Create a new project, naming it uploaddata (or whatever you like), add an app.yaml file

application: XXX
version: 1
runtime: python
api_version: 1

handlers:
- url: /remote_api
  script: $PYTHON_LIB/google/appengine/ext/remote_api/handler.py
  login: admin

Add the above to the app.yaml file. Use the correct application name and version, and do not change the handler's script.

4. Write a Python class that maps the datastore table to a class. An example:

(Student.py)
from google.appengine.ext import db

class Student(db.Model):
    studentId = db.StringProperty()
    name = db.StringProperty()
    address = db.StringProperty()
    ......

5. Create a data loader file used by the handler. Here is another example:

(loader.py)
import datetime
from google.appengine.ext import db
from google.appengine.tools import bulkloader
import Student

class StudentLoader(bulkloader.Loader):
    def __init__(self):
        bulkloader.Loader.__init__(self, 'Student',
                                   [('studentId', str), ('name', str), ('address', str)])

loaders = [StudentLoader]

Pay attention to columns that may contain accented French characters or Asian-language text: convert them to unicode properly, for example with lambda x: unicode(x, 'utf-8') in place of str.

6. From the command line, use the following command to upload the data. With the previous example, a sample command would look like:

appcfg.py upload_data --config_file=loader.py --filename=data.csv --kind=Student uploaddata

Wednesday, April 21, 2010

App Engine SDK 1.3.3 released

Although the Eclipse plug-in has not been updated yet, I went ahead and downloaded the zip file. This is only a minor release with a limited set of features. The most important one worth paying attention to is the SQLite support for Python. Despite the official word, "Note that this feature does not add SQL support to the App Engine SDK or service", it is a great advantage for developers to harness the benefits of SQLite. I just wonder when they will extend the support to Java. It seems Java is always a step behind Python.

Tuesday, April 20, 2010

New Warning Message on Google App Engine

I just noticed a fairly new warning in the App Engine log:

"This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application."

Previously, nothing highlighted why a simple operation might sometimes take more CPU time than expected. Now, with this handy warning, things are much clearer, and you don't have to worry about an internal coding issue in your project.

However, it would be much better if Google could solve the underlying issue. Currently, to avoid wasting time on such reloads, I have to run my cron job once per minute. Yep, when it runs once every 2 minutes, App Engine complains every time, and the overall CPU cost is even higher than with the more frequent job. Ironic, even ridiculous, huh?

I know some people are working on this issue now; hopefully it will be fixed in the near future.

Friday, April 9, 2010

Install spring 3.0 plug-in in Eclipse 3.5

Actually this should be a pretty old topic, since I have done the same thing more than 5 times. However, since I never noted down the details, each time I had to google to solve some issue, and every time I wondered why I hadn't written them down in the first place. Now I have finally decided to write it all down so that next time I don't have to google again and again.

All you need to do is use the Install New Software tool in the Help menu. The address of the Spring plug-in update site is http://springide.org/updatesite. Everything you need is written pretty clearly at http://springide.org/project/wiki/SpringideInstall.

However, there are several issues you might encounter. In my case, I always get a complaint about the missing component "package org.eclipse.ajdt.internal.ui.lazystart 0.0.0". This package belongs to AJDT (AspectJ Development Tools), which can be installed from the update site http://download.eclipse.org/tools/ajdt/35/update; the relevant info is here: http://www.eclipse.org/ajdt/downloads/.

After that, try the Spring plug-in once again. In my case there were no more complaints, and you should finally be able to get your hands dirty with Spring.

PS: I have tried this method on both Ubuntu 9.10 and Mac OS X Snow Leopard. I haven't tried it on Windows, so please forgive me if it fails there.

Monday, March 22, 2010

Warning in JDO lazy fetch with App Engine

If you consistently see the warning "org.datanucleus.store.appengine.MetaDataValidator warn: Meta-data warning for ****: The datastore does not support joins and therefore cannot honor requests to place related objects in the default fetch group. The field will be fetched lazily on first access. You can modify this warning by setting the datanucleus.appengine.ignorableMetaDataBehavior property in your config. A value of NONE will silence the warning. A value of ERROR will turn the warning into an exception.", add the following line to your jdoconfig.xml:

<property name="datanucleus.appengine.ignorableMetaDataBehavior" value="NONE" />

The fix itself is quite obvious, but where to put it might confuse many people: the line goes inside the persistence-manager-factory element, next to the other property entries.

Wednesday, March 17, 2010

Using memcache in GAE/J

Memcache is not a brand new concept; it has been used in many large-scale projects. The most famous memcache event is probably the Whale on Twitter's front page. For background on memcache, please Google it. In one sentence, memcache is a way to cache frequently used data, internal or external, to reduce the cost of database queries and remote invocations. This is a classic technique in all kinds of databases; however, since most of that code is written at a low level, most developers never touch it. Memcache brings the idea to the application level, which can be a great boost for large-scale applications.

Google App Engine has supported memcache for a long time, and it is one of the platform's built-in advantages. Here I'd like to briefly introduce how to use memcache in GAE/J. Sorry to those who are interested in Python.

First, let's construct a scenario. There is a university system on GAE/J that stores information for around 10k students. The system offers all kinds of features for students, such as enrollment, a study blackboard, course selection, and so on. As a result, there is a huge demand for database queries. However, this demand falls into two groups: students who really like the system and log in every day, and "lazy" students who seldom use it. So, to improve the efficiency of the system, caching the frequently used information will be a good help. Here we assume there is a table called Student; whatever action happens, the Student table always needs to be queried. Let's see how memcache can store the Student information.

First we generate a JDO POJO class to store Student information. As an example, the fields in the class are pretty simple.

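[java]
import java.io.Serializable;
import java.util.UUID;

import javax.jdo.annotations.IdentityType;
import javax.jdo.annotations.Inheritance;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

@PersistenceCapable(identityType = IdentityType.APPLICATION)
@Inheritance(customStrategy = "complete-table")
public class Student implements Serializable {
    @PrimaryKey
    @Persistent
    private String uuid;

    @Persistent
    private String name;

    @Persistent
    private String email;

    @Persistent
    private String address;

    public Student() {
        // Generate a unique key for each new student.
        this.uuid = UUID.randomUUID().toString();
    }

    // setters & getters
}
[/java]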

Then we need to construct a cache class, which will be in charge of the operations in the cache layer.

[java]
import java.util.Collections;
import java.util.logging.Logger;
// Cache, CacheException, CacheFactory and CacheManager are the JCache (JSR 107)
// types; import them from the package shipped with your App Engine SDK version.

public class QueryCache {
    private static final Logger log = Logger.getLogger(QueryCache.class.getName());
    private static QueryCache instance;
    private Cache cache;

    private QueryCache() {
        try {
            CacheFactory cacheFactory = CacheManager.getInstance().getCacheFactory();
            cache = cacheFactory.createCache(Collections.emptyMap());
        } catch (CacheException e) {
            log.severe("Error in creating the cache");
        }
    }

    // Lazily create the single shared instance.
    public static synchronized QueryCache getInstance() {
        if (instance == null) {
            instance = new QueryCache();
        }
        return instance;
    }

    public void putInCache(String address, String student) {
        cache.put(address, student);
    }

    // Returns the cached value, or null on a cache miss.
    public String findInCache(String address) {
        if (cache.containsKey(address)) {
            return (String) cache.get(address);
        } else {
            return null;
        }
    }
}
[/java]

Inside this class, we create a single Cache instance using the Singleton pattern; the cache itself behaves like a map. We also define two methods, one to put the student information into the cache and another to get the information out of the cache.

Finally, we construct a servlet to query the student information.

public class QueryServlet extends HttpServlet {
    private static final Logger log = Logger.getLogger(QueryServlet.class.getName());

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        log.info("Now start...");
        QueryCache cache = QueryCache.getInstance();
        String studentC = cache.findInCache("Address7694");
        if (studentC != null) {
            resp.getWriter().write("Found the item in cache!");
        } else {
            resp.getWriter().write("No hit in cache!");
            PersistenceManager pm = PMF.get().getPersistenceManager();
            try {
                Query query = pm.newQuery(Student.class);
                query.setFilter("address == 'Address7694'");
                @SuppressWarnings("unchecked")
                List<Student> students = (List<Student>) query.execute();
                if (!students.isEmpty()) {
                    // Cache the first match so the next request skips the datastore.
                    Student student = students.get(0);
                    log.info("Found one:" + student.toString());
                    resp.getWriter().write("Found one:" + student.toString());
                    cache.putInCache("Address7694", student.toString());
                } else {
                    log.info("None found!");
                    resp.getWriter().write("None Found!");
                }
            } finally {
                pm.close();
            }
        }
    }
}

This is a very simple example that briefly shows how to use memcache in GAE/J. In reality there is a lot more to think about, such as where to use memcache and how to set the expiration time of each cache entry.
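To give a feel for the expiration question, here is a small sketch of the same look-in-cache-first flow with a time-to-live on each entry. It is a plain in-memory stand-in, not the App Engine memcache API: the class name ExpiringCache, the loader callback, and passing the current time in as a parameter (so the behaviour is easy to check) are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical in-memory stand-in for a cache with per-entry expiration.
final class ExpiringCache {

    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    ExpiringCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns a fresh cached value if there is one; otherwise asks the loader
    // (for example a datastore query) and caches whatever it returns.
    String get(String key, long nowMillis, Function<String, String> loader) {
        Entry cached = entries.get(key);
        if (cached != null && nowMillis < cached.expiresAtMillis) {
            return cached.value;
        }
        String loaded = loader.apply(key);
        if (loaded != null) {
            entries.put(key, new Entry(loaded, nowMillis + ttlMillis));
        } else {
            entries.remove(key);
        }
        return loaded;
    }
}
```

With a short time-to-live the cache hits the datastore more often but serves fresher data; with a long one it saves more queries but may return stale student records. Picking that trade-off per kind of data is the real design work.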

Sunday, March 14, 2010

Export a jar file in Eclipse project

If you would like to export your Eclipse project as a jar file, the first thing that comes to mind may be to build it with Ant. But here I'd like to recommend a very handy tool called FatJar, a helpful Eclipse plug-in for packaging your project.

It is really simple: install the plug-in from http://kurucz-grafika.de/fatjar, then export the project as a fat jar to generate the jar file you need.

More details can be found in its tutorial, just Google it.

Tuesday, March 9, 2010

A piece of Java code to split large file

I have been working with Azure recently. When you try to upload large files to the Azure storage service, you cannot simply push them: files larger than 64MB have to be split into small chunks, which are then uploaded as a block list. The concept is not hard to understand, but splitting large files may not be easy for junior developers.

Here is some simple code that splits large files into small pieces, which may be helpful if this is what you want to achieve:

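import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.logging.Logger;

class FileSplit {
    private static final Logger log = Logger.getLogger(FileSplit.class.getName());

    private File f;
    private FileInputStream fis;
    private String path;
    private String fileName;
    int count;

    public FileSplit(File f) {
        this.f = f;
        fileName = f.getName();
        count = 0;
        // Use the absolute parent so a bare file name does not yield a null path.
        path = f.getAbsoluteFile().getParent();
    }

    // Writes the file out in pieces of up to 4 MB each and returns the number
    // of pieces written, or 0 if a piece could not be created.
    public int split() {
        try {
            log.info("Start to split files");
            fis = new FileInputStream(f);
            byte buf[] = new byte[4 * 1000 * 1000];
            int num = 0;
            while ((num = fis.read(buf)) != -1) {
                if (createSplitFile(buf, 0, num) == -1) {
                    return 0;
                }
                count++;
                log.info("Finished one piece");
            }
            log.info("All finished");
        } catch (Exception e) {
            log.severe(e.getMessage());
        } finally {
            if (fis != null) {
                try {
                    fis.close();
                } catch (Exception e) {
                    log.severe(e.getMessage());
                }
            }
        }
        return count;
    }

    // Writes one piece named <count>.tmppt; returns 1 on success, -1 on failure.
    private int createSplitFile(byte buf[], int zero, int num) {
        FileOutputStream fosTemp = null;
        try {
            fosTemp = new FileOutputStream(path + "/" + count + ".tmppt");
            fosTemp.write(buf, zero, num);
            fosTemp.flush();
        } catch (Exception e) {
            return -1;
        } finally {
            // fosTemp stays null if the file could not be opened.
            if (fosTemp != null) {
                try {
                    fosTemp.close();
                } catch (Exception e) {
                    log.severe(e.getMessage());
                }
            }
        }
        return 1;
    }
}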
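For completeness, here is a sketch of the reverse step: putting the pieces back together, which is handy for checking that the split lost nothing. It assumes the naming used above (0.tmppt, 1.tmppt, ... in one directory); the class name FileJoin and its method signature are made up for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical companion to FileSplit.
final class FileJoin {

    // Concatenates dir/0.tmppt ... dir/(count-1).tmppt into target, in order,
    // and returns the total number of bytes written.
    static long join(Path dir, int count, Path target) throws IOException {
        long total = 0;
        try (OutputStream out = Files.newOutputStream(target)) {
            for (int i = 0; i < count; i++) {
                Path piece = dir.resolve(i + ".tmppt");
                total += Files.copy(piece, out);
            }
        }
        return total;
    }
}
```

If the returned byte count equals the size of the original file, the pieces are at least complete; comparing a checksum of both files would be the stricter test.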

Monday, March 8, 2010

Clarify the differences between Amazon S3 and Amazon CloudFront

While cloud computing is the buzzword around the world, Amazon is no doubt one of the most important competitors in this field. Its products S3 and CloudFront both play vital roles in cloud storage and services.

However, there are often misunderstandings about these two services. Here is a simple clarification which may be a little bit helpful.

Amazon S3 is a storage service, which means it is solely used to store your data somewhere in the cloud, and it does not replicate the data to other locations for you. S3 gives you a URL that points to that specific piece of data, and every request is routed to the same location to fetch it.

Amazon CloudFront is a CDN service, which replicates all your data to different locations around the world. When you are in NY, the nearest node to download your data from will be in the US, not in Europe; when you are in London, CloudFront will route you to Dublin, which is far closer than the US. This effectively shortens the distance the data has to travel.

From the above we can see that, most of the time, you will want to use both services together: S3 to store the data and CloudFront to deliver it efficiently.

Sunday, March 7, 2010

Using Google Code as a SVN repository

Although Google is branded as a search company, it has now reached every corner of the IT world. I just realized that without Google I could no longer live comfortably. I use Gmail to send and receive mail, Google Docs to create, edit and share documents, Google Calendar to arrange my weekly and daily schedule, Google Reader to read the latest information and news, Google Maps to discover and explore neighborhoods and destinations, Google Buzz and Wave to socialize, and, more importantly for me, Google App Engine to earn money.

Recently I had the thought of participating in the open-source world, so the first thing that came to mind was to create a small project on Google Code, a space previously dominated by SourceForge. After some attempts, I have set up a very small project named "restfulhttpclient". It is written in Java with the NetBeans 6.8 IDE, so the checked-out code has the NetBeans project structure.

It is not difficult to adopt Google Code as an SVN repository, since it provides the basic functions for hosting code. Once you have an account and have created a project there, you simply use your favorite SVN client to commit and check out your code.

However, I do have some questions. The one that confuses me most is why I have to name my project in all lowercase characters. Someone may argue that lowercase letters are easier to display in the URI and for people to type, but I believe a simple mapping and checking mechanism would not be that hard.

Another thing is that each project actually has two different URIs mapped to it: one in the format ***.googlecode.com, and the other code.google.com/p/***. Whenever you upload files, it is better to use the first one, since the latter will give you a 400 Bad Request response.

All right, if you are interested, try downloading my project in a while (it is not finished yet).

http://code.google.com/p/restfulhttpclient

It is called restful http client simply because I need those basic functions at work. When developing a project full of RESTful web services, invoking an HTTP request is the most common task, and a handy GUI tool that covers the most common functions would be really nice. Even though a lot of similar products are out there, I could not find one that suits me: some have so many capabilities that they are extremely difficult to master, while others leave out functions that I have to use. In this project, users can send GET, POST, PUT and DELETE requests to a server and add specific headers and basic authentication information. The response code and message are displayed on the panel once the client receives them.
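As a rough idea of what happens behind such a panel, here is a minimal sketch built only on the JDK's HttpURLConnection. It is not the code of restfulhttpclient: the class name SimpleRestClient, its two methods, and the URL in the usage note are assumptions for illustration, and it leaves out request bodies for POST and PUT.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;

// Hypothetical helper, not taken from the restfulhttpclient project.
final class SimpleRestClient {

    // Builds the value of the Authorization header for HTTP basic authentication.
    static String basicAuth(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Sends a body-less request with the given method and headers and
    // returns the response code and message as one string.
    static String send(String method, String url, Map<String, String> headers)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod(method);
            for (Map.Entry<String, String> header : headers.entrySet()) {
                conn.setRequestProperty(header.getKey(), header.getValue());
            }
            return conn.getResponseCode() + " " + conn.getResponseMessage();
        } finally {
            conn.disconnect();
        }
    }
}
```

A call might look like SimpleRestClient.send("GET", "http://localhost:8183/students", Map.of("Authorization", SimpleRestClient.basicAuth("user", "pass"))), where the URL is only a placeholder for whatever service you are testing.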