Archive

Archive for the ‘English - 英文’ Category

My AD Fun Experiment

July 29th, 2009 Bali No comments

Today I got a mail from lakequincy.com, saying:

“Hi Bill,

I noticed that you were never able to plug the Lake Quincy Media ad tags into your site. Are you still interested in earning revenue from displaying ads targeted to Microsoft developers? If so, you can get your tags from here: <certain link>

Let me know if you have any questions or if you’ve decided against running the ads and I’ll set your account to inactive for you.”

I even forgot that I resgistered in their site. To encourage such great customer service, I go ahead to try how the AD really works for me; finally decide to put a small square in the side bar. You can easily find it now if you scroll down a bit and pay attention to the left side bar. Let us see how many dollars I can make after one or two quarters. By putting an AD to the blog, it looks more of-the-business, doesn’t it? Of course, you can bid for that AD position. :- )

Update(2/26/2009)

The AD is disabled temporarily due to security alert.

P2P Backup System w/o SPOF for Work Group

July 29th, 2009 Bali No comments

This is also one of my half-completed ideas years ago. It was recalled recently by two stories:

Stories

#1: One of my team mates lost his Outlook email archive due to a mistaken operation. He is very upset because all of his emails in past two years are gone, unrecoverablly. Suggested Outlook mail file is 2GB, but people can easily drive it up to 10GB and often notice this too late. As what he said in MSN signature, “my archive, 555…”

#2: While I was browsing one of my favorite bloggers – Brad Abrams’s blog, ran into a shocking post, Help! My hard disk crashed. He can afford a new disk, but he need the data – especially “the pictures of the kids recent birthday party”. In the comments, people suggested various ways to recover it, unfortunately looks like no luck.

This leads to the most basic questions: why people don’t backup data even if they know they are in the risk of losing them? Probably because of:

1) The chance is very tiny

2) Current backup solutions are not “good enough”

For 1), although it might be true, but remember the result might be cataclysmic. Let us take a look at what is going on about 2).

Existing Backup Solutions

Traditional C/S – Most of current backup solutions are based on traditional C/S architecture. Assume one use 40% of 500GB disk drive on average; it is still a big capacity for enterprise with thousands of staff. Backup time is also another concern. Let me illustrate this with some math. For a backup server with two 1-Gbit NICs, its max throughput is about 200MB. If backup window(people are sleeping) is 10 hours, it means it can back up about 7.2TB(2*100*3600*10MB) data. Obviously it is hardly scalable along with business expansion.

Online backup – People can turn to AWS S3, or Azure SQL Data Services to simplify things on their own side. But this doesn’t make things even better if you look at current network speed. Calculating data is much easier than moving data around.

My P2P Backup System

p2p backup comes with several inherent advantages in handy:

1) No additional hardware expense. People often have at least 30% capacity left; we can leverage that for backup space.

2) No obvious network bottleneck.

Some people might also list “no single point of failure”, but that is not true. Let us take a look at how BT works, a general P2P network.

Basically here are what happens in P2P network – peers need talk to a central coordination server(i.e., Tracker in BT system) to get info about other peers, and then talk to individual peers about real data exchange. These two steps are essential. Single Point of Failure(SPOF) can happen in the coordination server. You may argue that you can enhance its reliability by techniques like mirroring, but does it look like over-heavyweight? We have many reliable services running by full time IT guys, why do we bother to re-invent the wheel? Several services can be utilized, such as AD, Sharepoint, exchange, which allow customized data writing. Data needed written into a central place would be:

1) Peer list by <IP, port>

2) Other meta data such as backup network name, Software RAID level

Of course, we need do compression, encryption, incremental backup, scheduled backup in the solution. I may do investigation later and update the post.

Also see other backup solutions

http://www.storegrid.com/index.html – select a folder, then backup, traditional

www.streamload.com – Large net disk, Upload your files to streamload, then view it, or email it to someone else.

http://www.beinsync.com/ – Access you computer: install a mini PHP web server onto it.

http://base.google.com/base/about.html – In about 15 minutes, your item will have a unique web address and be visible to the world.

http://www.openomy.com/openomy is an online file storage system designed to be a platform for Web 2.0 applications, built by Ian Sefferman

Memory Leaks Demo & Detection in .NET Application

July 29th, 2009 Bali No comments

Memory leaks are always headache of developers. Do .NET developers no longer bother to worry about memory leaks because of garbage collection? Yes and NO. GC periodically find objects that cannot be accessed in the future and then reclaim the resources used by the objects. GC achieves this by maintaining a list of references to live objects. When this mechanism is broken, memory leak happens.

There are many reasons to leak memory. In addition to calling unmanaged code from managed code, another one of general cases is about event handler. If you do this:

Foo.FooEvent += new EventHandler(MemoryLeaksHere.Method);

When you complete using MemoryLeaksHere, but you are still using Foo, then MemoryLeaksHere will still remain alive as well. MemoryLeaksHere object will leak memory as a result of failing to GC.

Let us take a look at one simple example first.

using System;

namespace MemoryLeakSample

{

class Foo

{

public static Foo myFoo;

public event EventHandler FooEvent;

public Foo()

{

myFoo = this;

}

public void FooMethod()

{

MemoryLeaksHere memLeak = new MemoryLeaksHere();

memLeak.TryQuit();

}

public void FireEvent()

{

FooEvent(null, null);

}

static void Main(string[] args)

{

Foo foo = new Foo();

for (int i = 0; i < 5; ++i)

{

foo.FooMethod();

}

GC.Collect();

GC.WaitForPendingFinalizers();

GC.Collect();

Console.WriteLine(“Check memory leak here.”);

}

}

/// <summary>

/// This object will cause memory leak

/// </summary>

public class MemoryLeaksHere

{

public MemoryLeaksHere()

{

Foo.myFoo.FooEvent += new EventHandler(OnMyFooEventFired);

Console.WriteLine(“\nObject-{0}: Construct. Subscribe.”, this.GetHashCode());

}

~MemoryLeaksHere()

{

Console.WriteLine(“Object-{0}: Deconstruct.”, this.GetHashCode());

}

public void TryQuit()

{

Console.Write(“Object-{0}: leak me?”, this.GetHashCode());

string input = Console.ReadLine();

if (string.Equals(input, “no”))

{

Foo.myFoo.FooEvent -= new EventHandler(OnMyFooEventFired);

Console.WriteLine(“Object-{0}: Unsubscribe.”, this.GetHashCode());

}

else

{

Console.WriteLine(“Object-{0}: Not Unsubscribe”, this.GetHashCode());

}

}

private void OnMyFooEventFired(object sender, EventArgs e)

{

// Do something

}

}

}

In MemoryLeaksHere object’s constructor, Foo starts to hold a reference to MemoryLeaksHere by registering event handler. In MemoryLeaksHere.TryQuit(), if we don’t unregister, memory leak will happen.

To be more intuitive, you can copy/paste sample code to VS2008, and then enable unmanged code debugging by following:

Project->Properties->Debug->Enable Unmanaged Code debugging

Now set a breakpoint at Check memory leak here”, and start build/debug. When being asked leak me or not, you can choose either yes or no. For example:

Here, looks like we leak two of them. Finally app will hit the breakpoint and stop. At this point, we can go to VS immedate window to load sos.dll, and then check how many objects in the heap:

!load sos.dll

extension C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll loaded

!dumpheap -type MemoryLeaksHere

PDB symbol for mscorwks.dll not loaded

Address MT Size

0132e7d0 00983104 12

0132eba0 00983104 12

total 2 objects

Statistics:

MT Count TotalSize Class Name

00983104 2 24 MemoryLeakSample.MemoryLeaksHere

Total 2 objects

So now we know there are two object instances are not recycled. Why are they not GC-ed? Because someone has a reference to them. Choose one of them, and use gcroot command.

!gcroot 0132e7d0

Note: Roots found on stacks may be false positives. Run “!help gcroot” for

more info.

Error during command: Warning. Extension is using a callback which Visual Studio does not implement.

Scan Thread 7592 OSTHread 1da8

ESP:12f434:Root:01312d48(MemoryLeakSample.Foo)->

0132f704(System.EventHandler)->

0132f6ec(System.Object[])->

0132e7dc(System.EventHandler)->

0132e7d0(MemoryLeakSample.MemoryLeaksHere)

Scan Thread 4704 OSTHread 1260

Now we can see that MemoryLeakSample.Foo is still referencing MemoryLeakSample.MemoryLeaksHere via event handler. If it is not 5 iterations, image what would happen if every incoming request results in a slice of memory leak… Soon or later, you online service will be down.

See also:

http://www.codeproject.com/KB/dotnet/Memory_Leak_Detection.aspx

http://blogs.msdn.com/jgoldb/archive/2008/02/04/finding-memory-leaks-in-wpf-based-applications.aspx

http://blogs.msdn.com/calvin_hsia/archive/2008/04/11/8381838.aspx

http://www.automatedqa.com/techpapers/net_allocation_profiler.asp

http://blogs.msdn.com/greg_schechter/archive/2004/05/27/143605.aspx

Designing Your Own Recent Posts Widget for MSDN Blog

July 29th, 2009 Bali 1 comment

In my MSDN blog, I need “Recent Posts”, but I don’t need archive side bar. After having played with template for a while, still no luck. Hmmm, looks like I have to DIY it. Fortunately in News sidebar, you can fill in raw html including JavaScript. Then next question is where we can retrieve post tiles. The immediate idea is from current DOM document. Through experiments, I found this is impossible because the DOM is not fully loaded yet when the script is executed. Later, I figured it out that all posts title can be gotten from RSS. For my blog, the address is http://blogs.msdn.com/bali_msft/rss.xml. One thing worth noticing is the fact that RSS in MSDN blog is not up to date – Your post will not instantly appear in the RSS. After I get all posts in RSS format, things became much easier. And then I go ahead to add more interesting things:

  • Posts background use two colors in turn
  • Show a new tag for posts less than 3 days old
  • Show latest 8 posts only
  • Show posts’ age

So, the final thing will look like this:

If you find it is useful, feel free to paste below code to you blog’s news section. Note to customize “Configurable params” to your own needs and leave other code intact. It works well at least in my IE 8 and Firefox 3.0.6.

<div id=”RecentPosts”></div>

<Script>

// Configurable params

var recentPostNumber = 8;

var rssUrl = “http://blogs.msdn.com/bali_msft/rss.xml”;

var title = “My Recent Posts”;

var newPostAgeInHour = 72;

// Cacluate age of one post. It is all about getting time span in Javascript

// return formate: x min; x hour y min, x day y min, x days, x yeas (ago)

// Refer to: http://www.w3schools.com/jsref/jsref_obj_date.asp

function calculateAge(postDate)

{

var ret = “fresh!”;

CurrentDate = new Date();

TimeSpan = new Date(CurrentDate – postDate);

var mySpanArray = new Array();

mySpanArray[0] = TimeSpan.getUTCFullYear()-1970;
mySpanArray[1] = TimeSpan.getUTCMonth();
mySpanArray[2] = TimeSpan.getUTCDate()-1;
mySpanArray[3] = TimeSpan.getUTCHours();
mySpanArray[4] = TimeSpan.getUTCMinutes();

var TimeSpanTagArray_1 = new Array(“years”, “months”, “days”, “hours”, “minutes”);

var TimeSpanTagArray_2 = new Array(“year”, “month”, “day”, “hour”, “minute”);

// Starting from non-zero element and pick two significant values

for(i = 0; i < mySpanArray.length; i++) {

if(mySpanArray[i] != 0) {

var correctTag = (mySpanArray[i] == 1)?(TimeSpanTagArray_2[i]):(TimeSpanTagArray_1[i]);

ret = mySpanArray[i] + ” “ + correctTag;

if(i+1 < mySpanArray.length && mySpanArray[i+1] != 0) {

correctTag = (mySpanArray[i+1] == 1)?(TimeSpanTagArray_2[i+1]):(TimeSpanTagArray_1[i+1]);

ret = ret + “, “ + mySpanArray[i+1] + ” “ + correctTag;

}

break;

}

}

return ret;

}

// Display the recent posts

// Refer to:

// http://www.w3schools.com/DOM/dom_node.asp

// http://www.w3schools.com/DOM/dom_methods.asp

function displayPosts (xmldoc)

{

var newTag = “<SPAN style=\”COLOR: red\”>(New!)</SPAN>”;

var posts = xmldoc.getElementsByTagName(“item”);

var displayText = “<h3>” + title + “</h3><UL>”;

if (posts.length < recentPostNumber) {

recentPostNumber = posts.length;

}

for(var i = 0; i < recentPostNumber; i++)

{

PostTitle = posts[i].firstChild.firstChild.nodeValue;

PostLink = posts[i].firstChild.nextSibling.firstChild.nodeValue;

PostDateStr = posts[i].firstChild.nextSibling.nextSibling.firstChild.nodeValue;

PostDate = new Date(PostDateStr);

CurrentDate = new Date();

// Calculate age

var _PostAge = calculateAge(PostDate);

var PostAge = “<SPAN style=\”font-size: 80%; color: black\”> (“ +_PostAge + ” ago)</SPAN>”;

// Show a new tag for posts happening last days defined by ‘newPostAgeInHour’

var myNewTag = “”;

if((CurrentDate.getTime() – PostDate.getTime())/1000/60 < newPostAgeInHour * 60) {

myNewTag = newTag;

}

// Get background color

var BKColor = (i%2 == 0)?(“#B8CCE4″):(“#DBE5F1″);

displayText = displayText + “<LI style=\”background-color:” + BKColor + “\”><A href=\”" + PostLink + “\”>” + myNewTag + PostTitle + PostAge + “</A></LI>”

}

displayText = displayText + “</UL>”;

var target = document.getElementById(“RecentPosts”);

target.innerHTML=displayText;

}

// Call back

function complete(){

if (req.readyState == 4) {

if (req.status == 200) {

displayPosts (req.responseXML);

}

}

}

// Initial async call

function getPosts()

{

if (window.XMLHttpRequest) {

req = new XMLHttpRequest();

}else if (window.ActiveXObject) {

req = new ActiveXObject(“Microsoft.XMLHTTP”);

}

if(req){

req.open(“GET”, rssUrl, true);

req.onreadystatechange = complete;

req.send(null);

}

}

// Entry point

getPosts();

</Script>

Searching For a Number in Shifted Sorted Array within O(log(n)) time

July 29th, 2009 Bali No comments

Run into the algorithm problem long time ago. Now post my answer here. A sorted array, say: {1,2,3,4,5,6,7,8,9,10,11,12}, do right rotate through carry unknown times, and then it might become: {6,7,8,9,10,11,12,1,2,3,4,5}. Now we need get the index of a given number, say 4, from the array within O(log(n)) time. Apparently a 8-year-old can get it done with O(n) time.

We can think of it this way: take the middle element of array, if target is found, fine; if not, and then array become two parts, one is sorted array, the other is shifted sorted array. As illustrated as below diagram:

If the target falls into the sorted array half, we can simple do a binary search; otherwise, repeat this operation in the other half in recursive way. You can see this is divide-and-conquer algorithm. Obviously this is O(log(n)).

Code

//

// A typical binary search implementation

//

int _BinarySearch(unsigned int ShiftedArray[], unsigned int start, unsigned int end, unsigned int target)

{

// Not found

if( start == end && ShiftedArray[start] != target) {

return -1;

}

unsigned int middle = start + (end – start)/2;

if(target == ShiftedArray[middle])

{

return middle;

} else if (target > ShiftedArray[middle]) {

return _BinarySearch(ShiftedArray, middle + 1, end, target);

} else {

return _BinarySearch(ShiftedArray, start, middle – 1, target);

}

}

//

// Select a given number from shifted array.

// ShiftedArray is something like = {6,7,8,9,10,11,12,1,2,3,4,5}

// If found, return index of the number; if not, reutrn -1

// Require log(N)

//

int SearchShiftedArray(unsigned int ShiftedArray[], unsigned int start, unsigned int end, unsigned int target)

{

// Start meets end

if( start == end && ShiftedArray[start] != target) {

return -1;

}

unsigned int middle = start + (end – start)/2;

if(target == ShiftedArray[middle])

{

return middle;

} else if(ShiftedArray[middle] < ShiftedArray[start]) { // Right half is sorted linearly

if((target > ShiftedArray[middle]) && (target <= ShiftedArray[end])) {

return _BinarySearch(ShiftedArray, middle + 1, end, target);

} else {

return SearchShiftedArray(ShiftedArray, start, middle-1, target);

}

} else { // Left half is sorted linearly

if((target >= ShiftedArray[start]) && (target < ShiftedArray[middle])) {

return _BinarySearch(ShiftedArray, start, middle – 1, target);

} else {

return SearchShiftedArray(ShiftedArray, middle + 1, end, target);

}

}

}

Test cases

Positive: {6,7,8,9,10,11,12,1,2,3,4,5}, target = 3, target = 8

Negative: {6,7,8,9,10,11,12,1,2,3,4,5}, target = 0, target = 13

Boundary: {6,7,8,9,10,11,12,1,2,3,4,5}, target = 6, target = 5

Exceptional: {…max}, target = max

One more interesting thing is the statement that “only about 10 percent of the professional programmers implemented binary search correctly.” Do you know why? Check this.