Why Johnny Can’t Write Multithreaded Programs

Programming for multiple threads is not fundamentally different from writing an event-oriented GUI application or even a straight up sequential application. The important lessons of encapsulation, separation of concerns, loose coupling, etc. all apply. But developers get into trouble with multiple threads when they don’t apply those lessons; instead they try to apply the mostly-irrelevant bits of information they learned about threads and synchronization primitives from introductory multithreading texts.

Some people, when confronted with a problem, think, “I know, I’ll use regular expressions.” Now they have two problems. –Jamie Zawinski

Some people, when confronted with a problem, think, “I know, I’ll use threads!” Now they have 10 problems. –Bill Schindler

Too many programmers writing multithreaded programs are like Mickey Mouse in The Sorcerer’s Apprentice. They learn to create a bunch of threads and get them mostly working, but then the threads go completely out of control and the programmer doesn’t know what to do.

Unlike Mickey, those programmers don’t have the luxury of a kindly master wizard who can wave his magic wand and restore sanity. Instead, the programmer resorts to all manner of ugly hacks in an attempt to fix problems as they pop up. The result is invariably an overly complicated, restrictive, fragile, and unreliable application that’s prone to deadlocks and other multithreading hazards. Not to mention unexplained crashes, poor performance, and incomplete or incorrect results.

You’ve probably wondered why that is. Perhaps you’ve accepted the common fallacy that “Multithreading is hard.” It’s not. If a multithreaded program is unreliable it’s most likely due to the same reasons that single-threaded programs fail: The programmer didn’t follow basic, well known development practices. Multithreaded programs seem harder or more complex to write because two or more concurrent threads working incorrectly make a much bigger mess a whole lot faster than a single thread can.

The “multithreading is hard” fallacy is propagated by programmers who, out of their depth in a single-threaded world, jump feet first into multithreading – and drown. Rather than re-examine their development practices or preconceived notions, they stubbornly try to “fix” things, and use the “multithreading is hard” excuse to justify their unreliable programs and missed delivery dates.

Note that I’m talking here about the majority of programs that use multithreading. There are difficult multithreading scenarios, just as there are difficult scenarios in the single-threaded world. But those are relatively rare. For the majority of what most programmers do, the problems just aren’t that complicated. We move data around, transform it, perhaps do some calculations from time to time, and finally store the results in a database or display them on the screen.

Upgrading a typical single-threaded program so that it uses multiple threads isn’t (or shouldn’t be) very difficult. It becomes difficult for two reasons:

  • Developers fail to apply simple, well known development practices; and
  • Most of what they were taught in introductory multithreading materials is technically correct but completely irrelevant to the problems at hand.

The most important concepts in programming are universal; they apply equally to single-threaded and multithreaded programs. Programmers who drown in a sea of threads haven’t learned the important lessons from writing single-threaded programs. I know this because they make the same fundamental mistakes in their multithreaded programs as they do in their single-threaded programs.

Probably the most important lesson to be learned from the past 60 years of software development is that global mutable state is bad. Really bad. Programs that depend on global mutable state are harder to reason about and generally less reliable, because there are too many possible ways for the state to change. There is a huge amount of research to back up that generalization, and countless design patterns whose primary purpose is to implement some type of data hiding. The best thing you can do to make your programs easier to reason about is to eliminate as much global mutable state as possible.

In a single-threaded sequential program, the likelihood of data being mangled is proportional to the number of components that can modify that data.

It’s usually not possible to completely eliminate global state, but we developers have very effective tools for strictly controlling which parts of a program can modify it. In addition, we’ve learned to create restrictive API layers around primitive data structures so that we also control how those data structures are changed.

The problems of global mutable state became more apparent in the late ’80s and early ’90s with the widespread use of event-oriented programming. Programs no longer start at the beginning and follow a single predictable path to conclusion. Instead, the program has an initial state and events occur at unpredictable times in an unpredictable order. The code is still single-threaded, but it’s asynchronous. The likelihood of data being mangled increases because the order in which events can occur is a factor. It’s not uncommon to find that if event A occurs before event B, then everything’s fine. But if A follows B, especially if event C occurs in between, then the data is mangled beyond recognition.

Adding concurrent threads complicates the problem even further because multiple methods can manipulate the global state at the same time. It becomes impossible to reason about how the global state is changing. Not only is the order of events unpredictable, but multiple threads of execution can be updating the state at the same time. At least in the asynchronous case you can guarantee that one event will complete its processing before any other event can start. In short, it is possible to say with certainty what the global state will be at the end of an event’s processing. With multiple threads it’s impossible in the general case to say which events will execute concurrently, and it’s therefore impossible to say what the global state is at any given point in time.

A multithreaded program with extensive global mutable state is one of the best demonstrations of the Heisenberg uncertainty principle I know of. It’s impossible to examine the state without changing the program’s behavior.

When I launch into my prepared rant about global mutable state (a somewhat expanded version of the last few paragraphs), programmers roll their eyes and tell me that they already know that. If they do know that, their programs don’t show it. The programs are filled with global mutable state, and the programmers wonder why their programs don’t work.

Not surprisingly, the most important part of creating a multithreaded program is design: figuring out what the program has to do, designing independent modules to perform those functions, clearly identifying what data each module needs, and defining the communications paths between modules. [Also: designing the project team’s t-shirt. Some things take priority. –Ed.] The fundamental process is no different from designing a single-threaded program. The key to success is, as with a single-threaded program, limiting interactions between the modules. If you eliminate shared mutable state, then data sharing problems are impossible.

You might think that you can’t afford the time to design your application so that it doesn’t use global state. In my opinion you can’t afford not to. Trying to manage global mutable state kills more multithreaded programs than anything else. The more you have to manage, the more likely it is that your program will crash and burn.

Most real world programs require some shared state that can be changed, and that’s where programmers most often get into trouble. Seeing the need for sharing state, programmers often reach into their multithreading toolbox and pull out the only tool they have: the all-purpose lock (critical section, mutex, or whatever it’s called in their particular language). They figure, I suppose, that they can eliminate the data sharing problems with mutual exclusion.

The number of problems you can encounter with a single lock is astounding. There are race conditions to think about, gating problems with an overly broad lock, and fairness issues, just to name a few. If you have multiple locks, especially nested locks, you have to worry about deadlock, livelock, lock convoys, and other concurrency hazards in addition to the problems associated with a single lock. Things get complicated in a hurry.

When writing or reviewing application code, I have a simple rule of thumb that rarely fails: If you used a lock, you probably did something wrong.

That statement can be taken two ways:

  1. If you need a lock, then you probably have global mutable state that has to be protected against concurrent updates. The existence of global mutable state indicates a flaw in the application’s design, which you should review and change.
  2. Locks are difficult to use correctly, and locking bugs can be incredibly difficult to isolate. The likelihood of there being an error in the way you used the lock is very high. If I see a lock, especially in a program that exhibits unusual behavior, the first place I look for the failure is the code that depends on the lock being used correctly. And that’s where I usually find it.

Both of those interpretations apply.

Multithreading isn’t hard. Properly using synchronization primitives, though, is really, really hard. You probably aren’t qualified to use even a single lock properly. Locks and other synchronization primitives are systems level constructs. People who know a lot more about multithreading use those constructs to build concurrent data structures and higher level synchronization constructs that mere application programmers like you and me use in our programs. Application programmers should use the low-level synchronization primitives about as often as they make direct device driver calls: almost never.

Trying to solve a data sharing problem with locks is like trying to put out a fire by throwing liquid oxygen on it. As with fires, prevention is the best solution. If you eliminate shared state, you have no reason to misuse those synchronization primitives.

Most of what you know about multithreading is irrelevant

Introductory multithreading materials explain what threads are. Then they launch into discussions of how to make those threads work together in various ways, such as controlling access to shared data with locks and semaphores, and perhaps controlling when things happen with events. There’s detailed discussion of condition variables, memory barriers, critical sections, mutexes, volatile fields, and atomic operations. You’re given examples of how to use those low level constructs to do all manner of systems level things. By the time a programmer is halfway through that material, she thinks she knows how to use those primitives in her applications. After all, if you understand how to use something at the systems level, using it at the application level should be trivial, right?

This is like teaching a teenager how to build an internal combustion engine from discrete parts and then, without the benefit of any driving instruction, setting him behind the wheel of a car and turning him loose on the roads. The teenager understands how the car works internally, but he has no idea how to drive it from point A to point B.

Knowing how threads work at the systems level is mostly irrelevant to understanding how to use them in an application program. I’m not saying that programmers shouldn’t know how things work under the hood, just that they shouldn’t expect that knowledge to be directly applicable to the design or implementation of a business application. After all, knowing the details of the intake, compression, combustion, and exhaust cycle doesn’t help you in getting from home to the grocery store and back.

Introductory multithreading textbooks (and computer science courses) shouldn’t be teaching those low level constructs. Rather, they should concentrate on common classes of problems and show developers how to use higher level constructs to solve those problems. For example, a large number of business applications are in concept extremely simple programs: They read data from one or more input devices, apply some arbitrarily complex processing to that data (perhaps querying some other stored data in the process), and then output the results.

These programs very often fit nicely into a producer-consumer model with three threads:

  • The input thread reads data and places it on the input queue.
  • The processing thread reads records from the input queue, processes them, and puts them on the output queue.
  • The output thread reads records from the output queue and stores them.

The three threads operate independently and communicate through the queues. Although technically those queues are shared state, in practice they are communications channels with their own internal synchronization. The queues support multiple producers and consumers, all adding or removing items concurrently.

Because the input, processing, and output are each isolated, it’s easy to change their implementations without affecting the rest of the program. As long as the queue data types remain unchanged, the individual pieces can be refactored at will. In addition, because the queues handle an arbitrary number of producers and consumers, adding more producers or consumers is no problem. There could be a dozen input threads all writing to the same queue, or multiple processing threads removing input items and crunching the data. Within the confines of a single computer, this model scales well.

Perhaps most importantly, modern programming languages and libraries make it easy to create a producer-consumer application. In .NET you have concurrent collections and TPL Dataflow. Java has ExecutorService, BlockingQueue, and other classes in the java.util.concurrent package. In C++ you have the Boost threading library and Intel’s Threading Building Blocks. Microsoft introduced its Asynchronous Agents Library with Visual Studio 2010. Similar libraries are available for Python, JavaScript, Ruby, PHP, and for all I know many other languages. You can create a producer-consumer application with any of those packages without ever having to use a lock, semaphore, condition variable, or any other synchronization primitive.
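As a rough illustration of the three-thread model described above, here is a minimal Java sketch built on BlockingQueue from java.util.concurrent. The class name PipelineSketch, the END marker, and the upper-casing step are assumptions made up for the example; real input, processing, and output code would replace them. Note that the sketch contains no explicit lock, semaphore, or condition variable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical three-stage pipeline; all names here are illustrative only.
class PipelineSketch {
    // Unique marker object that tells the next stage no more records are coming.
    private static final String END = new String("END");

    static List<String> run(List<String> source) {
        final BlockingQueue<String> inputQueue = new LinkedBlockingQueue<>();
        final BlockingQueue<String> outputQueue = new LinkedBlockingQueue<>();
        final List<String> stored = new ArrayList<>(); // written only by the output thread

        Thread input = new Thread(() -> {
            try {
                for (String record : source) {
                    inputQueue.put(record);
                }
                inputQueue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread processing = new Thread(() -> {
            try {
                // Compare by identity so a real record equal to "END" is not mistaken for the marker.
                for (String r = inputQueue.take(); r != END; r = inputQueue.take()) {
                    outputQueue.put(r.toUpperCase(Locale.ROOT)); // stand-in for real processing
                }
                outputQueue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread output = new Thread(() -> {
            try {
                for (String r = outputQueue.take(); r != END; r = outputQueue.take()) {
                    stored.add(r); // stand-in for writing to a database or the screen
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        input.start();
        processing.start();
        output.start();
        try {
            input.join();
            processing.join();
            output.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pipeline", e);
        }
        return stored; // safe to read: join() orders this after the output thread's writes
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("alpha", "beta"))); // prints [ALPHA, BETA]
    }
}
```

Order is preserved here only because each stage has a single thread. With several processing threads the output order would no longer be guaranteed, and each consumer would need its own end marker.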

Granted, those libraries likely make liberal use of many different synchronization primitives. That’s okay. Those libraries were written by people who know multithreading a whole lot better than does your average application programmer. Using a library like that is no different from using a language’s runtime library, or writing in a high level language rather than Assembly language.

The producer-consumer model is just one example. The libraries I mentioned above include classes with which you can implement many common multithreading design patterns without once dipping into low-level multithreading. It’s possible to create extensive multithreading applications without knowing a thing about how threads and synchronization work under the hood.

Use the libraries

Writing programs that use multiple threads is not fundamentally different from writing single-threaded synchronous programs. The important lessons of encapsulation and data hiding are universal, and become even more important when multiple concurrent threads are involved. If you ignore those important lessons, then no amount of low level threading knowledge can save you.

Programmers today have plenty to worry about at the application level without having to think about systems-level things. As applications become more involved, we increasingly hide complexity behind API layers. We’ve been doing this for decades. One could make a good argument that hiding complexity from programmers is the primary reason they are able to create complex applications. After all, don’t we already hide the complexities of the file system, the UI message loop, low-level communication protocols, etc.?

Multithreading concepts should be no different. The majority of multithreading scenarios business programmers are likely to encounter are well known and implemented in libraries that hide the bewildering complexity of dealing with concurrency. We should use those libraries in the same way that we use libraries of user interface controls, communications protocols, and the countless other tools that simplify our jobs. Leave low level multithreading to the people who know what they’re doing: the ones who write the libraries we use to build real programs.

Jim Mischel is a developer with Professional Datasolutions, Inc., a leading provider of software, hardware, and professional services to convenience retailers and wholesale petroleum marketers. When he’s not banging out code or writing about his experiences, he’s probably putting in miles on his bike or working on his latest wood carving project. Keep up with Jim on his blog.

Thread Interference

Consider a simple class called Counter:

class Counter {
    private int c = 0;

    public void increment() {
        c++;
    }

    public void decrement() {
        c--;
    }

    public int value() {
        return c;
    }

}

Counter is designed so that each invocation of increment will add 1 to c, and each invocation of decrement will subtract 1 from c. However, if a Counter object is referenced from multiple threads, interference between threads may prevent this from happening as expected.

Interference happens when two operations, running in different threads, but acting on the same data, interleave. This means that the two operations consist of multiple steps, and the sequences of steps overlap.

It might not seem possible for operations on instances of Counter to interleave, since both operations on c are single, simple statements. However, even simple statements can translate to multiple steps by the virtual machine. We won’t examine the specific steps the virtual machine takes — it is enough to know that the single expression c++ can be decomposed into three steps:

  1. Retrieve the current value of c.
  2. Increment the retrieved value by 1.
  3. Store the incremented value back in c.

The expression c-- can be decomposed the same way, except that the second step decrements instead of increments.

Suppose Thread A invokes increment at about the same time Thread B invokes decrement. If the initial value of c is 0, their interleaved actions might follow this sequence:

  1. Thread A: Retrieve c.
  2. Thread B: Retrieve c.
  3. Thread A: Increment retrieved value; result is 1.
  4. Thread B: Decrement retrieved value; result is -1.
  5. Thread A: Store result in c; c is now 1.
  6. Thread B: Store result in c; c is now -1.

Thread A’s result is lost, overwritten by Thread B. This particular interleaving is only one possibility. Under different circumstances it might be Thread B’s result that gets lost, or there could be no error at all. Because they are unpredictable, thread interference bugs can be difficult to detect and fix.
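One way to avoid this lost update without writing a lock yourself is to let an atomic class perform the read-modify-write as a single step. The sketch below is an illustration, not part of the original Counter: AtomicCounter and AtomicCounterDemo are hypothetical names, and AtomicInteger comes from java.util.concurrent.atomic.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of Counter whose updates cannot interleave.
class AtomicCounter {
    private final AtomicInteger c = new AtomicInteger(0);

    public void increment() {
        c.incrementAndGet(); // retrieve, add 1, and store as one atomic step
    }

    public void decrement() {
        c.decrementAndGet(); // retrieve, subtract 1, and store as one atomic step
    }

    public int value() {
        return c.get();
    }
}

class AtomicCounterDemo {
    // Thread A increments and Thread B decrements the same counter, the same number of times.
    static int runOnce(int iterations) {
        AtomicCounter counter = new AtomicCounter();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                counter.increment();
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                counter.decrement();
            }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the threads", e);
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(runOnce(100000)); // prints 0: no update is lost
    }
}
```

Running the same experiment against the plain Counter above may print a non-zero value on some runs, which is the interference described in this section.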

Design Pattern – Singleton Pattern

The Singleton pattern is one of the simplest design patterns in Java. It is classified as a creational pattern because it provides a controlled way to create an object.

The pattern involves a single class that is responsible for creating an object while making sure that only one object is ever created. The class provides a way to access its only object directly, without the caller needing to instantiate the class.

Implementation

We’re going to create a SingleObject class. The SingleObject class has a private constructor and holds a static instance of itself.

The SingleObject class provides a static method that exposes its static instance to the outside world. SingletonPatternDemo, our demo class, will use the SingleObject class to get a SingleObject object.

Singleton Pattern UML Diagram

Step 1

Create a Singleton Class.

SingleObject.java

public class SingleObject {

   //create an object of SingleObject
   private static SingleObject instance = new SingleObject();

   //make the constructor private so that this class cannot be
   //instantiated
   private SingleObject(){}

   //Get the only object available
   public static SingleObject getInstance(){
      return instance;
   }

   public void showMessage(){
      System.out.println("Hello World!");
   }
}

Step 2

Get the only object from the singleton class.

SingletonPatternDemo.java

public class SingletonPatternDemo {
   public static void main(String[] args) {

      //illegal construct
      //Compile Time Error: The constructor SingleObject() is not visible
      //SingleObject object = new SingleObject();

      //Get the only object available
      SingleObject object = SingleObject.getInstance();

      //show the message
      object.showMessage();
   }
}

Step 3

Verify the output.

Hello World!
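Because the instance is created in a static initializer, the JVM runs that initialization once during class loading, so every thread that calls getInstance sees the same object. The following is a minimal check of that property. EagerSingleton is a hypothetical stand-in written the same way as SingleObject, and SingletonThreadCheck is an illustrative helper, not part of the tutorial's code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for SingleObject, written the same way (eager static instance).
class EagerSingleton {
    private static final EagerSingleton INSTANCE = new EagerSingleton();

    private EagerSingleton() {}

    static EagerSingleton getInstance() {
        return INSTANCE;
    }
}

class SingletonThreadCheck {
    // Calls getInstance() from several threads and counts the distinct objects seen.
    static int distinctInstances(int threadCount) {
        Set<EagerSingleton> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> seen.add(EagerSingleton.getInstance()));
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the threads", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(8)); // prints 1
    }
}
```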

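Wifi Direct: Group Issues

1. On Android 4.2 and later, once two devices have been paired successfully over Wi-Fi Direct, the saved group is listed in Settings. From then on the two phones connect to each other directly without any confirmation, and no dialog pops up asking the user to accept or decline. Android 4.0 does not have groups yet.

2. In the 4.2 source code of WifiP2pSettings, the group is remembered automatically by default after a successful connection.

In WifiP2pManager, once two devices have connected and formed a group: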

case WifiP2pManager.RESPONSE_PERSISTENT_GROUP_INFO:
    WifiP2pGroupList groups = (WifiP2pGroupList) message.obj;
    if (listener != null) {
        ((PersistentGroupInfoListener) listener).onPersistentGroupInfoAvailable(groups);
    }
break;

onPersistentGroupInfoAvailable(groups) is a method declared in an inner interface of that class. WifiP2pSettings implements the interface and overrides the method:

public void onPersistentGroupInfoAvailable(WifiP2pGroupList groups) {
    mPersistentGroup.removeAll();
    for (WifiP2pGroup group: groups.getGroupList()) {
        if (DBG)     Log.d(TAG, " group " + group);
        mPersistentGroup.addPreference(new WifiP2pPersistentGroup(getActivity(), group));
    }
}

It receives the list of groups and then adds each group to the UI.

3. The object that wraps a group

public class WifiP2pPersistentGroup extends Preference {
    public WifiP2pGroup mGroup;
    public WifiP2pPersistentGroup(Context context, WifiP2pGroup group) {
        super(context);
        mGroup = group;
    }
    @Override
    protected void onBindView(View view) {
        setTitle(mGroup.getNetworkName());
        super.onBindView(view);
    }
    int getNetworkId() {
        return mGroup.getNetworkId();
    }
    String getGroupName() {
        return mGroup.getNetworkName();
    }
}
// If a group is tapped, show a dialog asking whether to delete the group
else if (preference instanceof WifiP2pPersistentGroup) {
    mSelectedGroup = (WifiP2pPersistentGroup) preference;
    showDialog(DIALOG_DELETE_GROUP);
}
else if (id == DIALOG_DELETE_GROUP) {
    int stringId = R.string.wifi_p2p_delete_group_message;
    AlertDialog dialog = new AlertDialog.Builder(getActivity())
        .setMessage(getActivity().getString(stringId))
        .setPositiveButton(getActivity().getString(R.string.dlg_ok), mDeleteGroupListener)
        .setNegativeButton(getActivity().getString(R.string.dlg_cancel), null)
        .create();
    return dialog;
}

4. Deleting a group

//delete persistent group dialog listener
mDeleteGroupListener = new OnClickListener() {
    @Override
    public void onClick(DialogInterface dialog, int which) {
        if (which == DialogInterface.BUTTON_POSITIVE) {
            if (mWifiP2pManager != null) {
                mWifiP2pManager.deletePersistentGroup(mChannel,
                        mSelectedGroup.getNetworkId(),
                        new WifiP2pManager.ActionListener() {
                            public void onSuccess() {
                                if (DBG) Log.d(TAG, " delete group success");
                            }
                            public void onFailure(int reason) {
                                if (DBG) Log.d(TAG, " delete group fail " + reason);
                            }
                        });
            }
        }
    }
};

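5. manager.requestGroupInfo(channel, activity) calls back into the onGroupInfoAvailable method of the GroupInfoListener interface to deliver the group information. An implementation of GroupInfoListener must provide onGroupInfoAvailable(WifiP2pGroup group); the group contains information about one Group Owner and any number of Group Clients.

Wifi Direct: Functional Testing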

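1. When the WIFI_P2P_STATE_CHANGED_ACTION broadcast arrives, update the Wi-Fi Direct state and act according to the new state. If Wi-Fi is off, the user has to be sent to the settings screen to turn it on. I originally wanted to re-check after returning from settings, so I put the check in onResume. That turned out to be a problem: this screen can also navigate to other screens, in which case the state may not have changed, and onResume runs before the method that the broadcast receiver calls, so the isWifiP2pEnabled value read in onResume is not the latest one. I therefore moved the check so that it runs right after the state is updated; that way it runs whenever the state changes.

When the broadcast receiver gets the broadcast, it calls a method on the Activity. After setting the state, that method hands off whatever should happen next to a separate method.

public void setIsWifiP2pEnabled(boolean isWifiP2pEnabled) {
    this.isWifiP2pEnabled = isWifiP2pEnabled;
    // The state changed, so run the check
    testAndDiscover();
}

// Check whether Wi-Fi is enabled; if it is, start discovering devices
private void testAndDiscover() {
    if (!isWifiP2pEnabled) {
        AlertDialog.Builder builder = new AlertDialog.Builder(this);
        builder.setTitle("Notice")
            .setMessage("Please make sure your device supports Wi-Fi Direct. If it does, turn on Wi-Fi in Settings first.")
            .setPositiveButton("Open settings", new DialogInterface.OnClickListener() {
                @Override
                public void onClick(DialogInterface dialog, int which) {
                    startActivity(new Intent(Settings.ACTION_WIRELESS_SETTINGS));
                }
            })
            .setNegativeButton("Never mind", new DialogInterface.OnClickListener() {
                @Override
                public void onClick(DialogInterface dialog, int which) {
                    // Nothing to do; the dialog is simply dismissed
                }
            }).show();
    } else { // Wi-Fi Direct is supported and enabled
        WifiDirectUtil.discoverPeers(DeviceListActivity.this, manager, channel);
    }
}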

2. The WIFI_P2P_THIS_DEVICE_CHANGED_ACTION broadcast tells the broadcast receiver to update this device's own status.

if (WifiP2pManager.WIFI_P2P_THIS_DEVICE_CHANGED_ACTION.equals(action)) {
    // Update this device's status
    activity.updateThisDevice((WifiP2pDevice) intent
            .getParcelableExtra(WifiP2pManager.EXTRA_WIFI_P2P_DEVICE));
}

The intent delivered with the broadcast carries the information about this device, so (WifiP2pDevice) intent.getParcelableExtra(WifiP2pManager.EXTRA_WIFI_P2P_DEVICE) yields the WifiP2pDevice object for the local device.

Updating this device's status:

public void updateThisDevice(WifiP2pDevice device) {
    TextView myName = (TextView) findViewById(R.id.my_name);
    TextView myStatus = (TextView) findViewById(R.id.my_status);
    myName.setText(device.deviceName);
    myStatus.setText(Util.getDeviceStatus(device.status));
}

Here device.status is an int, so it has to be converted to the corresponding text to be meaningful when displayed:

public static String getDeviceStatus(int deviceStatus) {
    switch (deviceStatus) {
    case WifiP2pDevice.AVAILABLE:
        return "Available";
    case WifiP2pDevice.INVITED:
        return "Invited";
    case WifiP2pDevice.CONNECTED:
        return "Connected";
    case WifiP2pDevice.FAILED:
        return "Failed";
    case WifiP2pDevice.UNAVAILABLE:
        return "Unavailable";
    default:
        return "Unknown";
    }
}

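3. Methods such as discoverPeers and requestPeers originally lived in the Activity. To offer them as a library that other programs can call, they need to be extracted into a utility class. For example: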

/**
 * Discover devices.
 * @param context
 * @param manager
 * @param channel
 */
public static void discoverPeers(final Context context, WifiP2pManager manager, Channel channel) {
    Log.i(TAG, "discoverPeers");
    manager.discoverPeers(channel, new ActionListener() {

        @Override
        public void onSuccess() {
            // On success, the corresponding broadcast will be received
        }

        @Override
        public void onFailure(int reason) {
            Toast.makeText(context, "Device discovery failed, error code: " + reason,
                    Toast.LENGTH_SHORT).show();
        }
    });
}

/**
 * Update the status of this device.
 * @param activity
 * @param device
 */
public static void updateThisDevice(Activity activity, WifiP2pDevice device) {
    Log.i(TAG, "updateThisDevice");
    TextView myName = (TextView) activity.findViewById(R.id.my_name);
    TextView myStatus = (TextView) activity.findViewById(R.id.my_status);
    myName.setText(device.deviceName);
    myStatus.setText(getDeviceStatus(device.status));
}

With this in place, the Activity and the BroadcastReceiver only need to call these methods and pass in the right arguments.

When the broadcast receiver gets the WIFI_P2P_PEERS_CHANGED_ACTION broadcast, it has to call WifiP2pManager's requestPeers method. The second argument is a PeerListListener object, whose onPeersAvailable method is called back automatically. Originally the device list adapter implemented PeerListListener directly, but that hard-wires the two together, so I decided to move requestPeers into the utility class as well.

onPeersAvailable receives the updated device list and should update the adapter's data and refresh it. It used to look like this:

@Override
public void onPeersAvailable(WifiP2pDeviceList peerList) {
    // peers is the device list collection used by the adapter
    peers.clear();
    peers.addAll(peerList.getDeviceList());

    notifyDataSetChanged(); // refresh the adapter
    if (peers.size() == 0) {
        return;
    }
}

Now that PeerListListener is separated from the adapter, the latest list has to be pushed into the adapter through a method call, followed by a refresh. The adapter class therefore needs a method that sets its own List. Since users of the library would not know up front that this method is required, I wrote an interface:

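public interface BaseDeviceAdapter {
    /**
     * Replace the adapter's device List and refresh the adapter's contents.
     * @param peers
     */
    public void updateDeviceList(List<WifiP2pDevice> peers);
}

DeviceAdapter, the adapter I wrote, then implements the interface and overrides this method: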

// Update peers and refresh the adapter
@Override
public void updateDeviceList(List<WifiP2pDevice> peers) {
    this.peers = peers;
    notifyDataSetChanged();
}

In the utility class:

/**
 * Request the list of peers.
 * @param manager
 * @param channel
 * @param adapter the adapter whose contents should be updated, typed as the interface defined above
 */
public static void requestPeers(WifiP2pManager manager, Channel channel, final BaseDeviceAdapter adapter) {
    Log.i(TAG, "requestPeers");
    manager.requestPeers(channel, new PeerListListener() {

        @Override
        public void onPeersAvailable(WifiP2pDeviceList peerList) {
            // Copy the latest peers into a fresh list
            List<WifiP2pDevice> peers = new ArrayList<WifiP2pDevice>(peerList.getDeviceList());

            // Update the adapter through the interface passed in as a parameter
            adapter.updateDeviceList(peers);
        }
    });
}
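To see the decoupling in isolation, here is a minimal sketch in plain Java that runs without Android. Everything in it is a stand-in chosen for illustration: DeviceListSink plays the role of BaseDeviceAdapter, device names are plain Strings instead of WifiP2pDevice, PeerUtilSketch plays the role of the utility class, and RecordingAdapter plays the role of DeviceAdapter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for BaseDeviceAdapter; devices are plain Strings here instead of WifiP2pDevice.
interface DeviceListSink {
    void updateDeviceList(List<String> peers);
}

// Stand-in for the utility class: it knows only the interface, not the concrete adapter.
class PeerUtilSketch {
    static void deliverPeers(List<String> discovered, DeviceListSink sink) {
        sink.updateDeviceList(new ArrayList<>(discovered)); // hand over a copy
    }
}

// Stand-in for DeviceAdapter: keeps the list it was given and counts refreshes.
class RecordingAdapter implements DeviceListSink {
    List<String> peers = new ArrayList<>();
    int refreshCount = 0;

    @Override
    public void updateDeviceList(List<String> peers) {
        this.peers = peers;
        refreshCount++; // a real adapter would call notifyDataSetChanged() here
    }
}

class PeerUtilSketchDemo {
    public static void main(String[] args) {
        RecordingAdapter adapter = new RecordingAdapter();
        PeerUtilSketch.deliverPeers(Arrays.asList("phone-a", "phone-b"), adapter);
        // prints [phone-a, phone-b] refreshed 1 time(s)
        System.out.println(adapter.peers + " refreshed " + adapter.refreshCount + " time(s)");
    }
}
```

Because the utility method depends only on the interface, any adapter that implements it can be passed in without changing the utility class.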

4. When connecting to a device, what we have is a WifiP2pDevice. The WifiP2pConfig that WifiP2pManager actually needs for connect is populated from that WifiP2pDevice:

public static void connect(final Context context, WifiP2pManager manager,
    Channel channel, WifiP2pDevice device) {
    WifiP2pConfig config = new WifiP2pConfig();
    config.deviceAddress = device.deviceAddress;
    config.wps.setup = WpsInfo.PBC;
 
    manager.connect(channel, config, new ActionListener() {
        @Override
        public void onSuccess() {
            // WiFiDirectBroadcastReceiver will notify us. Ignore for now.
        }
 
        @Override
        public void onFailure(int reason) {
            Toast.makeText(context, "Connection failed, please try again", Toast.LENGTH_SHORT).show();
        }
    });
}

5. When the broadcast receiver gets the WIFI_P2P_CONNECTION_CHANGED_ACTION broadcast, the connection state has changed:

if (manager == null) {
    return;
}

NetworkInfo networkInfo = (NetworkInfo) intent
        .getParcelableExtra(WifiP2pManager.EXTRA_NETWORK_INFO);

if (networkInfo.isConnected()) { // Connected successfully
    Log.i(TAG, "Broadcast received: connected");
    WifiDirectUtil.requestConnectionInfo(manager, channel);
} else {
    // This broadcast arrives no matter which side drops the connection, so reset the data here,
    // for example by clearing some state or running discoverPeers again to refresh the UI
    activity.resetData();
}

// In the Activity: if the Wi-Fi Direct connection drops, search for devices again
public void resetData() {
    WifiDirectUtil.discoverPeers(DeviceListActivity.this, manager, channel);
}

In the Activity's onCreate method, adapter.clearPeers() is called on every entry to empty the adapter first. Otherwise leftover data from the previous session would affect what the screen shows next time.

In the adapter:

public void clearPeers() {
    peers.clear();
    notifyDataSetChanged();
}

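Disconnect in onDestroy. If the device was connected, the disconnect succeeds; if it was already disconnected, the disconnect fails, which is harmless either way.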

WIFI DIRECT, CONNECTIONS WITHOUT USER INTERACTION

As I wrote in the article “UX with Wifi Direct: The connection acceptance dialog”, there is an acceptance dialog that requires user interaction before any data can be exchanged between the devices.

This is not an ideal situation for the Thali project, where we actually need connections to be established fully automatically. I see two possible fixes that might work, depending on the requirements:

  1. Teach each device all “Known groups”, so that it never shows the dialog again
  2. Modify the communications so that they are handled in a way that never shows a dialog.

For the first fix, we could make an app in which you select the mode to run in (either Advertiser, which waits for connections, or a device that initiates the connection process). The app would then connect and, once the connection is verified to work by exchanging some data, the devices would swap roles and run a second round. During this run each device should show the dialog once, and the user would need to tap the accept button so that the information is stored in the “Known Groups” settings.

This would of course have to be done for every pair of devices that might connect to each other, and redone whenever a device is wiped.

The second fix would require making the connection a bit differently. In essence the steps are:

  1. Use the createGroup() function from WifiP2pManager to create a group. This creates an access point with a random SSID and password.
    • At the same time, start a server that accepts the incoming connections your service requires.
  2. Once the access point is ready, you’ll receive WIFI_P2P_CONNECTION_CHANGED_ACTION. Then fetch the group information to get the SSID and password.
  3. Create a local service to advertise the access point. You can use the local service instance to pass on the access point information (SSID, password and IP address).
  4. Do service discovery to find nearby peers advertising their access points. Once one is found, read the SSID, password and IP address it advertises.
  5. Use WifiManager to form the connection (instead of WifiP2pManager).
  6. Once the connection is established:
    1. Stop advertising the access point.
    2. Remove the access point.
    3. Stop searching for additional access points.
  7. Connect to the IP address you got from the service you discovered in step 4.
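Steps 3 and 4 need the SSID, password and IP address to travel through a service advertisement as text. Below is a minimal sketch of one way to pack and unpack them. The `AccessPointInfo` class and the colon-separated, Base64-encoded format are assumptions made for illustration; they are not taken from the MyWifiMesh project.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical value type for the data advertised in step 3 and read back in step 4.
final class AccessPointInfo {
    final String ssid;
    final String passphrase;
    final String ipAddress;

    AccessPointInfo(String ssid, String passphrase, String ipAddress) {
        this.ssid = ssid;
        this.passphrase = passphrase;
        this.ipAddress = ipAddress;
    }

    // URL-safe Base64 never produces ':', so ':' can safely separate the fields.
    private static String enc(String s) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    private static String dec(String s) {
        return new String(Base64.getUrlDecoder().decode(s), StandardCharsets.UTF_8);
    }

    // Packs the three fields into one string suitable for an advertisement field.
    String encode() {
        return enc(ssid) + ":" + enc(passphrase) + ":" + enc(ipAddress);
    }

    // Reverses encode(); rejects records that do not have exactly three fields.
    static AccessPointInfo decode(String record) {
        String[] parts = record.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 fields, got " + parts.length);
        }
        return new AccessPointInfo(dec(parts[0]), dec(parts[1]), dec(parts[2]));
    }
}
```

Encoding each field separately means an SSID or password containing the separator still round-trips. Advertisement fields are size-limited, so it is worth checking that the encoded string fits whichever field carries it.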

Here are a couple of pointers to remember with this approach:

  • It’s likely that all devices get the same IP address when they form the access point, so remember to remove the access point in step 6; otherwise you’ll communicate with your own server in step 7.
  • An Android device can be connected to only one WLAN access point at a time, so any active WLAN connection will be disconnected when you start connecting in step 5.

I made a simple example showing how fix number two works; you can find it on GitHub under DrJukka/MyWifiMesh. Do note that it’s not finalized and serves only as a proof of concept for further development.

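Using Wifi_Direct on the Android Platform

Wifi_Direct is currently the fastest wireless data link between devices, with speeds of up to 40 Mb/s. Google has supported Wifi_Direct since Android 4.0 (ICS), and Samsung supported it on its own devices even earlier. Even so, adoption has been lukewarm over the years, and few devices on the market support Wifi_Direct today.

Judging from the devices I have used, the Samsung I9100 implements Wifi_Direct on its Wifi hardware, so Wifi cannot be used while Wifi_Direct is active. The nexus7 and Padfone infinite (A80) have separate hardware for Wifi_Direct, so Wifi stays available while Wifi_Direct is in use.

The Android framework provides the android.net.wifi.p2p package for Wifi_Direct support. It contains 7 classes and 9 interfaces; WifiP2pManager is the core class, and all the other classes and interfaces exist to serve it.

Using Wifi_P2p requires two permissions: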

public static final String ACCESS_WIFI_STATE
Added in API level 1
Allows applications to access information about Wi-Fi networks
Constant Value: “android.permission.ACCESS_WIFI_STATE”
public static final String CHANGE_WIFI_STATE
Added in API level 1
Allows applications to change Wi-Fi connectivity state
Constant Value: “android.permission.CHANGE_WIFI_STATE”

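The rough Wifi_Direct pairing flow is as follows:

  1. Call WifiP2pManager.discoverPeers() to start scanning for devices.
  2. Get the discovered devices and pick one to pair with via WifiP2pManager.connect().
  3. Once pairing succeeds, connect according to WifiP2pInfo.isGroupOwner and WifiP2pInfo.groupOwnerAddress.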
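The last step of the flow can be sketched in plain Java: the group owner listens on a port, and every other device connects to the owner’s address. This is only a sketch under stated assumptions. The `P2pLink` class, the one-line protocol and the `demo()` method (which runs both roles over loopback instead of a real Wi-Fi Direct link) are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class P2pLink {
    // Group owner role: accept one member, read one line, answer with one line.
    static void serveOnce(ServerSocket server) throws IOException {
        try (Socket member = server.accept()) {
            BufferedReader in = reader(member);
            PrintWriter out = writer(member);
            out.println("owner got: " + in.readLine());
        }
    }

    // Member role: connect to the owner's address, send one line, return the reply.
    static String sendLine(InetAddress ownerAddress, int port, String line) throws IOException {
        try (Socket socket = new Socket(ownerAddress, port)) {
            writer(socket).println(line);
            return reader(socket).readLine();
        }
    }

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        // autoflush = true, so println() sends each line immediately
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Runs both roles on the loopback interface; port 0 lets the OS pick a free port.
    static String demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread owner = new Thread(() -> {
                try {
                    serveOnce(server);
                } catch (IOException e) {
                    throw new IllegalStateException(e);
                }
            });
            owner.start();
            String reply = sendLine(loopback, server.getLocalPort(), "hello");
            owner.join();
            return reply;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

On a real device the choice of role would depend on WifiP2pInfo.isGroupOwner: the owner would call something like `serveOnce`, and a member would pass WifiP2pInfo.groupOwnerAddress to something like `sendLine`, with both sides agreeing on the port in advance.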
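[Figure: flowchart of the pairing flow]

Points I think need attention when pairing over Wifi_Direct:

  1. In Settings, the switch that enables or disables WifiP2p appears to be merged with the Wifi switch (on some devices WifiP2p actually runs on the Wifi hardware), so Wifi must be turned on to use WifiP2p.
  2. Settings has a “make this device visible” button for BlueTooth, but Wifi_Direct has no such setting, only a button that starts a scan. I have not yet established whether a device that is not scanning is visible to other wifi_direct devices, but a device that is scanning definitely is. I therefore recommend that both Wifi_Direct devices to be paired run a scan.
  3. Pairing can succeed only if each of the two devices can discover the other. This is the underlying reason for having both devices scan.
  4. The developer cannot decide which device becomes the GroupOwner, but can express a preference through the WifiP2pConfig.groupOwnerIntent parameter.

In my tests, Wifi_Direct behaviour depended heavily on the specific device, and pairing time varied widely, from 10 seconds to 2 minutes or even longer. Roughly speaking, the nexus7 succeeded more often, about 70% of the time by my estimate, while the Padfone infinite (A80) succeeded less than 50% of the time.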
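To stay compatible with traditional Wifi devices, Wifi_Direct has another mode of use, which I will call compatibility mode. In compatibility mode only the device acting as GroupOwner needs to support Wifi_Direct; the other devices only need traditional Wifi (to me this mode looks a lot like Android’s portable hotspot feature).

The procedure is:

  1. The Wifi_Direct-capable device creates a group with WifiP2pManager.createGroup() and becomes the GroupOwner.
  2. The other devices scan for the Wifi hotspot that appears once the group is created, and connect to it.

One problem with compatibility mode: because a group member joins the group through its Wifi hardware, the member switches Wifi hotspots and loses its current network, which may disrupt network operations in progress. The group owner does not have this problem. With regular WifiP2p pairing, by contrast, WifiP2p and Wifi operate independently without affecting each other.

However, because compatibility mode skips scanning and pairing, connections succeed noticeably more often and are established considerably faster (the exact time is fairly random).

In my experience, the WifiP2p API is heavily asynchronous: the results of operations have to be obtained through callbacks (which is why the package has so many interfaces). More troublesome still, the callbacks of several key APIs (for example WifiP2pManager.connect) only report whether the operation was started; for the real result you must register a broadcast receiver and listen for broadcasts before moving on to the next step. The asynchronous design raises the logical complexity of the code.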
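Using NFC to set up a WifiP2p connection:

  1. Use NFC to pass the SSID and password of the group created by the owner device to the member device.
  2. The owner starts listening on an agreed port and waits for the member to connect.
  3. After receiving the data over NFC, the member connects to the group using the SSID and password.
  4. Once connected, the member obtains the owner device’s IP address (the gateway IP is enough) and connects to the owner’s agreed port.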
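The last step depends on the gateway address. On Android this is commonly read from WifiManager.getDhcpInfo().gateway, which is an int rather than a string. The sketch below assumes the usual little-endian packing (the lowest byte is the first octet) and shows the conversion in plain Java; the `GatewayAddress` class name and the sample value are illustrative only.

```java
// Hypothetical helper: turns a packed IPv4 address into dotted-quad text.
final class GatewayAddress {
    // Assumes little-endian packing: bits 0-7 hold the first octet, bits 24-31 the last.
    static String toDottedQuad(int packed) {
        return (packed & 0xFF) + "."
                + ((packed >>> 8) & 0xFF) + "."
                + ((packed >>> 16) & 0xFF) + "."
                + ((packed >>> 24) & 0xFF);
    }
}
```

For example, `GatewayAddress.toDottedQuad(0x0131A8C0)` yields `192.168.49.1`. The unsigned shift `>>>` keeps the result correct when the last octet is 128 or higher and the int is negative.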
Frequently asked questions:

  1. Which WifiP2p broadcasts are there, and what extras does each carry?

WifiP2pManager.WIFI_P2P_DISCOVERY_CHANGED_ACTION: sent when a WifiP2p scan starts or stops.
This broadcast carries an int extra with the key WifiP2pManager.EXTRA_DISCOVERY_STATE; its value is WifiP2pManager.WIFI_P2P_DISCOVERY_STARTED or WifiP2pManager.WIFI_P2P_DISCOVERY_STOPPED.

WifiP2pManager.WIFI_P2P_STATE_CHANGED_ACTION: sent when the WifiP2p state changes (if WifiP2p is available, it is also delivered when the BroadcastReceiver is registered).
This broadcast carries an int extra with the key WifiP2pManager.EXTRA_WIFI_STATE; its value is WifiP2pManager.WIFI_P2P_STATE_ENABLED or WifiP2pManager.WIFI_P2P_STATE_DISABLED.

WifiP2pManager.WIFI_P2P_THIS_DEVICE_CHANGED_ACTION: sent when this device’s WifiP2p state changes (if WifiP2p is available, it is also delivered when the BroadcastReceiver is registered).
This broadcast carries an extra of type WifiP2pDevice with the key WifiP2pManager.EXTRA_WIFI_P2P_DEVICE.

WifiP2pManager.WIFI_P2P_PEERS_CHANGED_ACTION: sent when the device list changes during a WifiP2p scan.
This broadcast carries no extra; on receiving it, call WifiP2pManager.requestPeers() to query the current device list.

WifiP2pManager.WIFI_P2P_CONNECTION_CHANGED_ACTION: sent when the WifiP2p group changes.
This broadcast carries two extras:
key WifiP2pManager.EXTRA_NETWORK_INFO, with a value of type NetworkInfo;
key WifiP2pManager.EXTRA_P2P_INFO, with a value of type WifiP2pInfo.
PS: a WifiP2p group change here covers the following cases:
  1. A group is created.
  2. A member joins the group.
  3. A member leaves the group.
  4. The group is closed.
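  2. How do I get the group info (WifiP2pGroup), and what is it for?

Call WifiP2pManager.requestGroupInfo() to get the group info. Its more useful methods are:
  1. getClientList() returns the list of members connected to the group.
  2. getNetworkName() returns the name (SSID) of the group’s wifi hotspot.
  3. getPassphrase() returns the password for connecting to the wifi hotspot.

  3. How do I get WifiP2pInfo?

It is available from the extras of the WifiP2pManager.WIFI_P2P_CONNECTION_CHANGED_ACTION broadcast, or by calling WifiP2pManager.requestConnectionInfo().

  4. How do I prevent the prompt dialog that pairing produces?

Without modifying the framework, I have not yet found a workable solution.
The dialog is provided by the system, and its behaviour depends on the device: the nexus shows it only on the first pairing, while the A80 shows it on every pairing.
Using Wifi_Direct in compatibility mode, however, produces no dialog.

  5. How do I connect to a wifi hotspot?

In my tests on the A80, the following code connects to a hotspot.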

// build a wifi config
final WifiConfiguration config = new WifiConfiguration();
config.allowedAuthAlgorithms.clear();
config.allowedPairwiseCiphers.clear();
config.allowedGroupCiphers.clear();
config.allowedKeyManagement.clear();
config.allowedProtocols.clear();
config.SSID = "\"" + ssid + "\""; // set the SSID, wrapped in double quotes
config.preSharedKey = "\"" + pw + "\""; // set the password, wrapped in double quotes
config.hiddenSSID = false;
config.status = WifiConfiguration.Status.ENABLED;
config.priority = 1;
config.allowedGroupCiphers.set(WifiConfiguration.GroupCipher.TKIP);
config.allowedGroupCiphers.set(WifiConfiguration.GroupCipher.CCMP);
config.allowedGroupCiphers.set(WifiConfiguration.GroupCipher.WEP104);
config.allowedGroupCiphers.set(WifiConfiguration.GroupCipher.WEP40);
config.allowedKeyManagement.set(WifiConfiguration.KeyMgmt.WPA_PSK);
config.allowedPairwiseCiphers.set(WifiConfiguration.PairwiseCipher.TKIP);
config.allowedPairwiseCiphers.set(WifiConfiguration.PairwiseCipher.CCMP);
config.allowedPairwiseCiphers.set(3); // raw index kept from the original listing
config.allowedProtocols.set(WifiConfiguration.Protocol.RSN);
config.allowedProtocols.set(WifiConfiguration.Protocol.WPA);
// connect to ap (addNetwork is an instance method, so call it on mWifiManager)
int id = mWifiManager.addNetwork(config);
config.networkId = id;
if (id != -1 && mWifiManager.enableNetwork(id, true)) {
    // enableNetwork succeeded; the source listing is cut off at this point
}

What Is DHCP?

Dynamic Host Configuration Protocol (DHCP) is a client/server protocol that automatically provides an Internet Protocol (IP) host with its IP address and other related configuration information such as the subnet mask and default gateway. RFCs 2131 and 2132 define DHCP as an Internet Engineering Task Force (IETF) standard based on Bootstrap Protocol (BOOTP), a protocol with which DHCP shares many implementation details. DHCP allows hosts to obtain required TCP/IP configuration information from a DHCP server.

Windows Server® 2008 includes the DHCP Server service, which is an optional networking component. All Windows-based clients include the DHCP client as part of TCP/IP, including Windows Vista®, the Windows Server® 2003 operating system, the Windows® XP Professional operating system, Microsoft Windows® 2000 Professional operating system, Microsoft Windows NT® Workstation 4.0 operating system, Microsoft Windows® Millennium Edition operating system, and the Microsoft Windows® 98 operating system.

Every device on a TCP/IP-based network must have a unique unicast IP address to access the network and its resources. Without DHCP, IP addresses for new computers or computers that are moved from one subnet to another must be configured manually; IP addresses for computers that are removed from the network must be manually reclaimed.

With DHCP, this entire process is automated and managed centrally. The DHCP server maintains a pool of IP addresses and leases an address to any DHCP-enabled client when it starts up on the network. Because the IP addresses are dynamic (leased) rather than static (permanently assigned), addresses no longer in use are automatically returned to the pool for reallocation.
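The lease-and-return cycle can be illustrated with a toy address pool. This is a sketch of the bookkeeping idea only, not of the DHCP wire protocol or of the Windows DHCP Server service; the `LeasePool` class, its method names and the addresses used with it are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory pool: addresses are leased for a fixed time and then reclaimed.
final class LeasePool {
    private static final class Lease {
        final String address;
        final long expiresAt;

        Lease(String address, long expiresAt) {
            this.address = address;
            this.expiresAt = expiresAt;
        }
    }

    private final Deque<String> free = new ArrayDeque<>();
    private final Map<String, Lease> leases = new HashMap<>(); // client id -> lease
    private final long leaseMillis;

    LeasePool(List<String> addresses, long leaseMillis) {
        free.addAll(addresses);
        this.leaseMillis = leaseMillis;
    }

    // Leases an address to the client, or renews the one it already holds.
    // Returns null when the pool is exhausted.
    String lease(String clientId, long now) {
        reclaimExpired(now);
        Lease current = leases.get(clientId);
        String address = (current != null) ? current.address : free.pollFirst();
        if (address == null) {
            return null;
        }
        leases.put(clientId, new Lease(address, now + leaseMillis));
        return address;
    }

    // Returns addresses whose lease has run out to the free pool.
    void reclaimExpired(long now) {
        Iterator<Map.Entry<String, Lease>> it = leases.entrySet().iterator();
        while (it.hasNext()) {
            Lease lease = it.next().getValue();
            if (lease.expiresAt <= now) {
                free.addLast(lease.address);
                it.remove();
            }
        }
    }

    int available() {
        return free.size();
    }
}
```

Time is passed in explicitly (`now`) so the behaviour is easy to follow: a client that renews before expiry keeps its address, while an address whose lease lapses goes back to the pool for the next client.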

The network administrator establishes DHCP servers that maintain TCP/IP configuration information and provide address configuration to DHCP-enabled clients in the form of a lease offer. The DHCP server stores the configuration information in a database that includes:

  • Valid TCP/IP configuration parameters for all clients on the network.
  • Valid IP addresses, maintained in a pool for assignment to clients, as well as excluded addresses.
  • Reserved IP addresses associated with particular DHCP clients. This allows consistent assignment of a single IP address to a single DHCP client.
  • The lease duration, or the length of time for which the IP address can be used before a lease renewal is required.

A DHCP-enabled client, upon accepting a lease offer, receives:

  • A valid IP address for the subnet to which it is connecting.
  • Requested DHCP options, which are additional parameters that a DHCP server is configured to assign to clients. Some examples of DHCP options are Router (default gateway), DNS Servers, and DNS Domain Name. For a full list of DHCP options, see DHCP Tools and Options.

In Windows Server 2008, the DHCP Server service provides the following benefits:

  • Reliable IP address configuration. DHCP minimizes configuration errors caused by manual IP address configuration, such as typographical errors, or address conflicts caused by the assignment of an IP address to more than one computer at the same time.
  • Reduced network administration. DHCP includes the following features to reduce network administration:
    • Centralized and automated TCP/IP configuration.
    • The ability to define TCP/IP configurations from a central location.
    • The ability to assign a full range of additional TCP/IP configuration values by means of DHCP options.
    • The efficient handling of IP address changes for clients that must be updated frequently, such as those for portable computers that move to different locations on a wireless network.
    • The forwarding of initial DHCP messages by using a DHCP relay agent, which eliminates the need for a DHCP server on every subnet.

Wi-Fi Direct vs. Bluetooth

Wi-Fi Direct (formerly Wi-Fi P2P) is appearing in more and more smartphones these days. Although I haven’t used it for anything useful so far, I thought I’d investigate a bit more to see how it works and how it differs from Bluetooth, beyond the obviously higher data transfer rates.

Unlike many other Wi-Fi functionalities, Wi-Fi Direct is not specified by the IEEE. Instead, the Wi-Fi Alliance, best known for its Wi-Fi certification program and logo, was responsible for the feature. It’s not their first one: they were also the driving force behind fixing the WEP encryption issue many years ago with WPA and later WPA2. They have also defined the set of rules for Wi-Fi Multimedia (WMM) and other options in the IEEE standards to ensure the implementation of a minimum set of features and interoperability between devices.

In essence, the Wi-Fi Direct feature is straightforward. While traditional Wi-Fi networks require an access point for devices to communicate with each other, Wi-Fi Direct allows two devices to communicate without a dedicated access point. Instead, one of the two devices assumes the role of the access point and becomes the Group Owner (GO) of the Wi-Fi Direct network. Other devices, even non-Wi-Fi Direct devices, can then join this group, as the GO behaves just like a standard access point. This means that the GO device also includes DHCP functionality to assign IP addresses to clients of the group network.

Once the Wi-Fi connection is established and an IP address has been assigned, the standard TCP/IP protocol stack is used to transfer data between devices. This is the biggest difference between Bluetooth and Wi-Fi Direct. While Bluetooth defines profiles for transferring images, business cards, calendar entries, audio signals, etc., Wi-Fi Direct itself only offers a transparent IP channel. To transfer data between two devices, compatible apps are required.

The advantage of this approach is that those apps can work in Wi-Fi Direct networks and also in traditional Wi-Fi environments. A TV, for example, that is capable of traditional Wi-Fi and Wi-Fi Direct can run a server application to receive pictures and video streams over both Wi-Fi variants. The home owner would stream his material over the traditional Wi-Fi network, while visitors would use Wi-Fi Direct to skip the somewhat complicated process of joining the local wireless infrastructure network.

The disadvantage of this approach in the example above is that the visitor first has to download a client app that can communicate with the server app on the television. While this might be acceptable for the scenario above, it’s too complicated for just exchanging a few images, files or contacts on the go between two smartphones. Perhaps Google will add such apps to Android in the future to make this easier, but this wouldn’t help with transferring files to the iPhone, Blackberries, Windows Phone, etc. This is where Bluetooth still shines, thanks to its standardized profiles, which are implemented on many operating systems. This is a bit unfortunate, as transferring multimedia content between different mobile operating systems would definitely benefit from fast Wi-Fi transmissions, given ever-increasing file sizes and Bluetooth’s practical transfer speed of only around 2 Mbit/s.

In practice, Wi-Fi Direct has arrived in Android smartphones but has not really been noticed so far. From version 4 onwards, the Android OS offers apps an API to scan for Wi-Fi Direct devices and then establish a connection. That offers interesting possibilities for ad-hoc communication, such as local multi-player games. Think kids in cars on a long road trip…

Service vs. IntentService

In short, a Service is a broader implementation for the developer to set up background operations, while an IntentService is useful for “fire and forget” operations, taking care of background Thread creation and cleanup.

From the docs:

Service: A Service is an application component representing either an application’s desire to perform a longer-running operation while not interacting with the user or to supply functionality for other applications to use.

IntentService: IntentService is a base class for Services that handle asynchronous requests (expressed as Intents) on demand. Clients send requests through startService(Intent) calls; the service is started as needed, handles each Intent in turn using a worker thread, and stops itself when it runs out of work.

When to use?

  • A Service can be used for tasks with no UI, but the tasks shouldn’t run too long. If you need to perform long tasks, you must use threads within the Service.
  • An IntentService can be used for long tasks, usually with no communication with the main thread. If communication is required, you can use a main-thread Handler or broadcast intents. Another use case is when callbacks are needed (Intent-triggered tasks).

How to trigger?

  • A Service is triggered by calling startService().
  • An IntentService is triggered using an Intent; it spawns a new worker thread, and onHandleIntent() is called on that thread.

Triggered From

  • The Service and IntentService may be triggered from any thread, activity or other application component.

Runs On

  • The Service runs in background but it runs on the Main Thread of the application.
  • The IntentService runs on a separate worker thread.

Limitations / Drawbacks

  • The Service may block the Main Thread of the application.
  • The IntentService cannot run tasks in parallel. Hence all the consecutive intents will go into the message queue for the worker thread and will execute sequentially.
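The sequential behaviour described for IntentService can be mimicked in plain Java with a single-threaded executor: every request is queued and handled one at a time on one worker thread, away from the caller’s thread. This is an analogy rather than the Android implementation, and the `SequentialWorker` class and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SequentialWorker {
    // One worker thread with an internal FIFO queue, so requests never run in parallel.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final List<String> handled = Collections.synchronizedList(new ArrayList<>());
    private final Set<String> threadNames = Collections.synchronizedSet(new LinkedHashSet<>());

    // Roughly analogous to startService(intent): returns at once, the work is queued.
    void submit(String request) {
        worker.execute(() -> {
            threadNames.add(Thread.currentThread().getName());
            handled.add(request);
        });
    }

    // Waits for the queue to drain and returns the requests in the order they were handled.
    // Unlike an IntentService, which stops itself, this sketch must be finished explicitly.
    List<String> finish() {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(handled);
    }

    // Names of the threads that actually ran the requests.
    Set<String> workerThreadNames() {
        return new LinkedHashSet<>(threadNames);
    }
}
```

Submitting "a", "b" and "c" and then calling `finish()` returns them in that same order, all handled by a single thread that is not the caller’s. That is the trade-off listed above: no blocking of the calling thread, but no parallelism either.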

When to stop?

  • If you implement a Service, it is your responsibility to stop the service when its work is done, by calling stopSelf() or stopService(). (If you only want to provide binding, you don’t need to implement this method.)
  • An IntentService stops itself after all start requests have been handled, so you never have to call stopSelf().